Survey Manual for Estimating the Incidence of Lead Paint in Housing
William G. Hall & Lillian T. Slovic

ABSTRACT 

This Manual is intended as a guide for municipal managers in performing
a survey to determine the prevalence of lead based paint in
their community's dwelling units. There are four parts to the Manual;
each is intended for a different audience.

Part  I  discusses  the  preliminary  considerations  for  a  survey.  It 
is  intended  for  the  department  head  or  executive  who  will  initiate  plans 
for  the  survey.  It  presents  a  managerial  overview  of  the  processes,  the 
cost  determinants,  criteria  for  the  establishment  of  objectives  and  the 
resources  required. 

Part II is intended for the survey manager and the inspector supervisors.
It contains more detailed information on the planning, staffing,
training,  and  execution  of  the  data  collection  phase  of  the  survey. 

Part  III  is  for  the  use  of  the  person  responsible  for  the  control 
and  management  of  the  data  collected  and  for  the  analysis  of  these  data. 

The  Appendices  contain  quite  detailed  information  about  procedures 
we  have  used  in  previous  surveys.  These  may  be  used  as  they  are  described 
or  may  be  modified  or  adapted  to  meet  specific  objectives. 


Key  Words:  Lead  paint;  lead  paint  detection;  lead  paint  programs;  lead 
poisoning;  portable  x-ray  fluorescence;  random  sampling;  x-ray  fluorescence. 


INTRODUCTION 

Lead  poisoning,  especially  of  children,  is  a  serious  national  health 
problem. The sources of lead to which children are exposed include air
(industrial and automobile exhausts); water (industrial waste, minerals,
and leaded pipes); food (packaging or storage containers; the food
product itself); and dust (particulate fallout). Another source of lead
intake in children is the ingestion of lead based paint.

Before  the  1940's,  paints  with  as  much  as  50  percent  lead  were  widely 
used  for  residential  applications.  According  to  the  1970  Housing  Census  1/, 
approximately  1.8  million  houses,  which  were  built  before  1940,  are  in  a 
dilapidated  condition.  Dilapidation  denotes  chipping,  flaking  and  peeling 
paint  as  well  as  decaying  walls  and  wood  surfaces.  Under  these  conditions 
lead paint, if present, is readily available for inhalation and ingestion
and  constitutes  a  potential  health  hazard. 

Accurate  diagnosis  and  careful  treatment  of  lead  poisoned  children 
will not eliminate the problem if they are returned to the same environment
to be reexposed to the hazard. The rate of recurrence of lead
poisoning is high among such children. Reexposure of survivors of
acute poisoning to an unabated lead source results in permanent
damage almost 100 percent of the time 2/.


1/  Census  Bureau,  U.  S.  Department  of  Commerce,  Plumbing  Facilities 
and  Estimates  of  Dilapidated  Housing,  HC  (6) ,  1973 

2/ Jane S. Lin-Fu, "Childhood Lead Poisoning--An Eradicable Disease,"
Children: An Interdisciplinary Journal for the Professions Serving
Children, Vol. 17, No. 1, January-February 1970.


The only sure way to prevent children from ingesting lead paint on housing
surfaces is to eliminate the hazard, either by removing the paint or by
covering it with barrier type building materials. This is an
expensive procedure--costing from hundreds up to several thousands of
dollars per dwelling--but perhaps not so expensive for society as the
care of youngsters who have suffered permanent physical or brain
damage from lead paint poisoning.

If  a  serious  lead  paint  poisoning  problem  has  been  identified  in 
your  community  through  discovery  of  a  high  lead  poisoning  incidence  and/or 
by  a  child  screening  program,  you  may  be  looking  for  a  strategy  to  combat 
it,  either  as  an  independent  lead  poisoning  control  program  or  as  part  of  a 
broader  housing  improvement  effort  such  as  a  code  enforcement  program. 
Either  way,  lead  paint  hazard  abatement  will  affect  your  overall  housing 
program  in  terms  of  funding  and  staff  commitments.  You  may  find  that  the 
addition  of  lead  hazard  elimination  as  part  of  your  housing  improvement 
program makes a raze-rebuild strategy more cost effective than a rehabilitation
strategy, especially if the latter seemed only marginally feasible
on  economic  grounds.  Whatever  course  of  action  is  anticipated,  the  first 
step  is  to  estimate  the  number  of  contaminated  housing  units,  the  extent 
of  contamination,  and  housing  characteristics  which  may  be  associated  with 
lead  paint  hazards  (location,  age,  occupancy  class,  type,  etc.).  The 
sample  survey  techniques  described  in  this  manual  can  help  you  accomplish 
this much more conveniently and with less expense than the alternative--a
unit  by  unit  census. 

Although this Manual is not intended to be used as a general guide
for household surveys, some of the material is applicable to such activities.
The text describes the ways in which this survey differs from the usual
opinion or household survey: it focuses on dwellings rather than people,
and it collects physical measurements rather than interview responses.
A detailed procedure for drawing a sample is presented. Although it is
by no means the only procedure which can be used, it was designed to meet
the specific objectives of estimating the hazard in a number of housing
categories with a high statistical confidence level and with a moderate
expenditure of resources.


PART  I 
PRELIMINARY  CONSIDERATIONS 


OVERVIEW  OF  THE  SURVEY  PROCESS 
There  are  many  possible  motivations  for  municipal  officials  to 
consider  performing  a  lead  paint  survey  in  housing.  The  survey  may  be 
health-oriented or code enforcement oriented; it may be part of a large
neighborhood or civic improvement program or an independent lead abatement
program; it may be focused on the entire city or some particular
target  group  or  groups  for  which  the  city  has  special  legal  or  social 
responsibilities.  The  findings  of  the  survey  may  be  required  either 
within  an  agency  or  at  a  higher  echelon  of  government;  they  may  be 
used  to  allocate  existing  resources  more  efficiently  in  an  operational 
sense,  to  permit  cost  effective  long  range  planning  for  a  lead  program, 
or  to  estimate  the  benefits  and  costs  of  a  lead  abatement  program  relative 
to  the  benefits  and  costs  of  alternative  action  programs.  Regardless  of 
the  conditions  leading  to  the  use  of  a  sample  survey,  the  following  tasks 
should  be  accomplished  sequentially  to  assure  that  the  survey  will  be 
efficient  and  adequate. 

1) Determine the objectives... Why is the information needed? What
decisions will be based on the information gathered?

2) Define the housing units to be studied... Are all housing units
to be included? Only owner-occupied units? Only rental units?

3) Determine the data to be collected and verify that they do not
exist elsewhere... What data are needed? Can they be gathered
by using data from preceding or current studies?

4) Choose the sample unit... Should it be the building? The dwelling
unit? What size sample is needed? What sampling method is to
be used? What degree of uncertainty can be tolerated?

5) Design and test the survey questionnaire or data collection
form... Is it unambiguous? Is it clear? Does it minimize
discretionary entries? Is it convenient for inspectors? Is it
convenient for editing, transcription, and keypunching? Should
abbreviations and/or conventions appear on the form or in a
companion manual?

6) Determine the method of resident contact... Should appointments
be  made  for  inspectors  or  should  they  make  unannounced  calls? 
How  should  publicity  be  handled? 

7)  Acquire  and  train  staff 

8)  Carry  out  the  data  collection 

9)  Edit  the  data  for  consistency  and  completeness. 

10)  Deal  with  nonresponses  (units  for  which  inspection  cannot  be 
performed.) 

11)  Analyze  the  data 

12)  Report  the  findings,  including  recommendations 

Although  these  steps  are  in  sequence,  it  should  be  understood  that 
they  are  not  independent. 

Many skills and talents are required, and they must be well coordinated
if the survey is to be successful. A useful procedure is to have
all who are expected to participate in the survey, or to use the data
collected, engage in a series of give-and-take discussions during the
planning stages.
"What  if"  questions  should  be  encouraged.  These  discussions  should  result 
in  a  useful,  practical  and  economical  survey  design. 

COST  DETERMINANTS 

The  most  critical  determinant  of  survey  cost  is  the  size  (number  of 
dwelling units to be inspected) of the sample. The sample size is also
the most important factor in the accuracy of the survey. There is therefore
a direct trade-off of cost vs. accuracy. The accuracy requirements
must  be  understood  by  all  involved;  insistence  on  more  accuracy  than  that 
required  will  result  in  unnecessary  expense;  less  accuracy  than  is 
required  will  result  in  an  inadequate  or  even  useless  survey. 

The  following  is  a  list  of  functions  which  may  be  considered  as 
distinct  elements  for  costing: 

1. Planning and design

2. Management

3. Development of population list

4. Selection of sample

5. Development of data collection form

6. Acquisition of lead measurement devices

7. Pretest

8. Printing

a) Data collection forms

b) Training materials

c) Manuals

d) Contact letters

e) Call back forms

9. Inspection

a) Recruitment

b) Training

c) Dwelling inspection

d) Inspector supervision

e) Inter-unit travel

10. Development of editing, coding and analysis
procedures (manual or computerized)

11. Editing, coding, and keypunching

12. Telephone

13. Mailing

14. Analysis of data including programming and computer costs

15. Report writing

16. Reproduction, dissemination, and presentation of report

As with the steps of the survey process, these functions are
distinct but interdependent. The size of the sample will influence the
methods  by  which  the  functions  are  to  be  accomplished.  The  sample  size 
controls the scale of the entire operation. If the sample size is small,
for example, editing will be manual rather than computerized, documents
can be reproduced by office equipment rather than by printing, and a smaller
organization is required.

ESTABLISHMENT  OF  OBJECTIVES 
The  first,  most  important,  and  most  difficult  step  of  a  survey  is 
the  establishment  of  objectives.  The  objectives  must  be  defined  in  terms 
of  the  purpose  or  purposes  for  which  the  collected  data  are  to  be  used. 
The  survey  should  provide  a  sound  basis  for  decision  making.  The  decision 
could be the selection of a course of action from alternative proposals,
the definition of a course of action, or the allocation of resources. It cannot
be  overemphasized  that  the  establishment  of  objectives  is  a  policy 
function  and  one  that  should  be  performed  at  a  high  policy  making  level. 
Although  the  services  of  a  survey  statistician  are  required  in  the  process, 
his  role  is  to  quantify  and  interpret,  and  to  estimate  costs.  His  job 
is not to establish objectives. The interaction between the statistician
and  the  policy  maker  must  be  collaborative. 

REQUIRED  RESOURCES 
General 

The survey can be carried out completely with "in-house" staff.
However, some tasks within the survey could be contracted
to individuals, firms, or other parts of your agency. These opportunities
are identified below.
Supervisors 

Choosing supervisors is extremely important, for their performance
is crucial to the success of the project. They are responsible for
hiring  and  training  the  inspectors,  maintaining  the  work  flow,  scheduling 
inspection  team  assignments,  checking  the  equipment,  reviewing  forms, 
and  following  departmental  policies  and  procedures. 

Ideally  one  would  like  the  supervisors  to  have  experience  in  survey 
supervision  and  to  be  knowledgeable  in  housing  and  public  health  matters. 
The  former,  however,  is  the  more  critical.  These  people  may  have  to  be 
recruited from outside the department, or the most nearly qualified candidates
may have to be trained in one or more of the critical areas.

Some projects have hired university graduate students in the appropriate
fields for these supervisory positions. However, in our experience
this  has  not  been  a  satisfactory  approach.  The  students  often  have  little 
work  background,  even  less  supervisory  experience  and  have  difficulty 
maintaining  authority  over  people  of  their  own  age  and  educational  back- 
ground. 

The  capability  of  people  that  you  hire  for  these  positions  will,  of 
course, depend on the candidates available. Since these are very important
positions for the survey, every effort should be made to obtain the
most  qualified  candidates. 

One  supervisor  has  been  found  to  be  appropriate  for  each  group  of  six 
inspection  teams,  that  is,  one  supervisor  for  twelve  inspectors. 
Data  Base  Manager 

The data base manager (computer analyst) is responsible for programming,
designing and operating the system which stores the data for
future  analysis.  This  person  might  be  detailed  from  a  budgeting 
department  or  some  other  department  of  your  agency  for  the  duration  of 
the  project  rather  than  hired  from  the  outside. 

This  is  one  of  the  functions  which  could  be  performed  by  an  outside 
contractor, such as a reliable data analysis group either inside or outside
of government. The same contract could also include the services
of  a  survey  statistician  to  design  and  perform  the  data  analysis. 
Inspection  Staff 

The  number  of  inspectors  needed  can  be  determined  from  the  number 
of  houses  to  be  inspected.  For  instance,  if  5-6  completed  inspections 
per day per two-man team are assumed (in our experience this is a
reasonable  estimate),  then  165-200  team  days  are  needed  to  complete  1000 
inspections.  If  the  inspectors  are  hired  for  a  three  month  period,  they 
will  be  able  to  work  60-65  days.  Thus,  3  teams  would  be  needed  to  perform 
1000  inspections  during  those  three  months.  These  estimates  are  based 
on  a  five  day-forty  hour  work  week,  but  the  inspection  teams  will  in 
fact  be  required  to  perform  some  inspections  during  evenings  and  weekends 
at  the  convenience  of  the  occupant. 
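
The staffing arithmetic above can be sketched in a few lines of Python. This is an illustration only: the function name `staffing_estimate` and its parameter names are hypothetical, while the ratios (two inspectors per team, one supervisor per six teams) are taken from the text.

```python
import math


def staffing_estimate(n_inspections, inspections_per_team_day, working_days,
                      inspectors_per_team=2, teams_per_supervisor=6):
    """Rough staffing estimate; all names here are illustrative."""
    # Team-days needed to complete every inspection, rounded up.
    team_days = math.ceil(n_inspections / inspections_per_team_day)
    # Teams needed to fit those team-days into the hiring period.
    teams = math.ceil(team_days / working_days)
    return {
        "team_days": team_days,
        "teams": teams,
        "inspectors": teams * inspectors_per_team,
        "supervisors": math.ceil(teams / teams_per_supervisor),
    }


if __name__ == "__main__":
    # Faster pace: 6 inspections per team-day over 65 working days.
    print(staffing_estimate(1000, 6, 65))
    # Slower pace: 5 inspections per team-day over 60 working days.
    print(staffing_estimate(1000, 5, 60))
```

Under the faster assumptions the sketch gives three teams; rounding up under the slower assumptions gives four, so the three-team figure is best read as an approximation that leaves little slack.
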
Statistical Consultant

A  survey  statistician  will  be  needed  at  various  times  during  the 
planning and execution of the survey. During the planning stage, he
should be involved in the establishment of objectives, in discussing all
of the questions concerning the sample, and in the design and testing of
the questionnaire. The statistician should, with the data base manager
or  the  data  analysis  group,  design  the  analysis  procedures  to  be  used. 

If  possible  the  statistician  should  be  continuously  available  on  a 
consultant  basis,  both  during  the  survey,  to  help  in  solving  whatever 
problems  arise,  and  during  the  analysis  phase  to  help  interpret  the 
results.  Possible  sources  for  this  consultant  include  your  agency,  a 
nearby  university  or  a  commercial  firm. 
Clerk/Telephonist 

Usually  a  department  will  have  a  clerical  worker  who  can  become  part 
of  the  lead  survey  group.  In  addition  to  maintaining  record  files,  that 
person  also  can  be  in  charge  of  mailing,  maintaining  mailing  records,  and 
arranging  inspection  appointments.  Just  as  the  inspectors  may  have  to  work 
Saturdays  and  evenings,  so  might  the  telephone  scheduler.  If  all  the  adults 
in  a  household  work,  the  only  time  they  can  be  reached  for  an  appointment  is 
at  night  or  on  the  weekend. 
Equipment 

Portable X-ray fluorescence (XRF) lead detectors are the most commonly
used and acceptable means for making large numbers of measurements of
lead paint in housing. They are non-destructive, and an inspector can
perform a measurement every 10-20 seconds. A digital readout displays
the measurement in terms of milligrams of lead per square centimeter of
surface area (mg/cm^2). These devices are limited in accuracy and durability.
They will measure the lead content of surfaces with good reliability (with an
error of less than 10%) above a level of 2 mg/cm^2. Below that level, readings
are suspect but can be confirmed with multiple readings at the same location.

New  XRF  devices  of  increased  accuracy  and  improved  general  performance 
are  under  commercial  development.  If  present  generation  instruments  are 
used  for  the  survey,  it  would  be  advisable  to  have  a  repair  service 
agreement  with  the  manufacturer.  Each  survey  team  will  need  a  detector. 
In addition, one or two spare detectors should be kept on hand to substitute
for those returned to the manufacturer for repair.

There  are  other  methods  that  can  be  used  for  the  detection  of 
lead  paint  in  houses.  Chemicals  such  as  sodium  sulfide  will  react 
with  lead  compounds  to  form  dark  colors  which  are  indicative  of  the 
presence  of  lead.  Chemical  spot  tests,  however,  are  not  quantitative 
and  can  only  give  a  rough  indication  of  lead's  presence.  Under  some 
circumstances  they  will  give  false  negative  reactions  to  lead,  that  is, 
indicate  that  no  lead  is  present  when  in  fact  it  is  present. 

Lead  in  paint  can  be  accurately  analyzed  in  the  laboratory  using 
sophisticated  chemical  techniques  and  instruments.  Such  procedures, 
however,  require  that  paint  chips  be  scraped  or  peeled  from  a  number 
of  surfaces,  catalogued  and  transmitted  to  a  laboratory  for  analysis. 
These  alternatives  to  the  use  of  XRF  detectors,  in  addition  to  being  very 
time  consuming  and  expensive,  are  likely  to  be  unacceptable  to  most 
dwelling  occupants. 
PART II


PREPARING FOR AND PERFORMING THE SURVEY

SELECTING THE SAMPLE

Simple random sampling from a city housing directory 1/ has been
found to be an efficient means of choosing residences to survey. It is
the  method  least  likely  to  be  affected  by  bias.  Every  unit  has  an  equal 
likelihood  of  being  selected,  with  prior  selections  having  no  bearing 
on  subsequent  choices. 
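
As a minimal sketch of such a draw, assuming the directory's street section has already been transcribed into a list of addresses, Python's standard `random.sample` gives every listed unit the same chance of selection. The function name, the made-up street listing, and the seed value are all hypothetical, and sampling without replacement is assumed so that no dwelling is chosen twice; the procedures in Appendix A may differ.

```python
import random


def draw_simple_random_sample(dwelling_list, sample_size, seed=None):
    """Draw a simple random sample of addresses without replacement."""
    units = list(dwelling_list)
    if sample_size > len(units):
        raise ValueError("sample size exceeds the number of listed units")
    rng = random.Random(seed)  # a recorded seed makes the draw repeatable
    return rng.sample(units, sample_size)


# Hypothetical street listing standing in for a directory section.
directory = [f"{number} Elm Street" for number in range(100, 200, 2)]
sample = draw_simple_random_sample(directory, 10, seed=1976)
for address in sample:
    print(address)
```

Recording the seed lets the statistician reproduce the selection later if the sample is questioned.
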

Housing  directories  are  available  for  many  United  States  cities  in 
the  population  range  of  10,000  to  1,000,000  and  can  be  found  at  libraries 
and  real  estate  offices.  They  are  updated  every  two  or  three  years.  In  a 
lead  based  paint  survey,  where  the  emphasis  is  on  the  older  housing  stock, 
the  omission  of  new  housing  units  built  after  the  most  recent  directory 
publication  can  be  overcome  by  including  a  correction  for  population  size 
in  the  analysis  or  by  using  a  better  population  size  estimate  from  a  more 
up-to-date  source. 

Most  directories  are  divided  into  two  sections:  the  first  is  an 
alphabetical  listing  of  all  city  residents  and  their  addresses;  the  second, 
an  alphabetical  listing  of  streets  with  the  street  numbers  listed  numerically. 
In  conducting  the  survey,  it  is  easier  to  go  from  street  to  street  than 
from  name  to  name,  and  so  the  second  directory  section  is  more  useful. 

Errors  do  exist  in  these  lists,  but  our  experience  has  been  that  there 
are  not  enough  of  them  to  invalidate  the  analysis.  Someone  with  a  fair 
knowledge  of  the  city  should  make  a  cursory  check  of  the  directory,  however, 
to  assure  that  no  major  publication  errors  exist,  such  as  the  omission  of 
an entire section of the city. In any case it is ultimately the responsibility
of the survey statistician to reconcile the population data source
to  the  survey  requirements. 

1/ Several procedures for drawing samples are described in Appendix A.
Random Sampling and the Binomial Distribution

A door to door, unit by unit census of all dwelling units in a city
to determine the extent and distribution of the community's lead paint
hazard is an extremely costly and time consuming process which most jurisdictions
will want to avoid. Random sampling is faster and more economical
and can yield very accurate results.

Random  sampling  involves  choosing  a  fraction  of  all  the  dwelling  units 
from  some  population  in  such  a  manner  that  each  unit  has  an  equal  chance  of 
being  selected.  By  applying  the  mathematics  of  probability  to  the  sample 
findings,  the  relevant  characteristics  (lead  paint  content  in  this  case) 
for  the  totality  of  dwellings  can  be  estimated.  In  a  lead  paint  survey, 
it may be desirable to deal with a number of subpopulations. The incidence
of  lead  paint  may  differ  according  to  some  physical  characteristic  of  the 
unit  (age,  for  example)  or  the  objectives  of  the  survey  may  require  that 
incidence  of  lead  paint  be  estimated  separately  for  different  tenancy 
groups  (public,  subsidized,  owner-occupied,  etc.). 

You,  the  planners  and  administrators  of  the  lead  paint  survey,  should 
have  an  understanding  of  the  sampling  process  even  if  you  plan  to  have 
a  statistician  handle  this  portion. 

A basic statistical concept which underlies the remainder of this
section (and Appendix A) is that of a binomial distribution. If a characteristic
of interest is defined in terms of yes-no, true-false, or hazardous versus
non-hazardous, the number of sampled units having the characteristic is
binomially distributed. The binomial
distribution accurately reflects the ways in which laws or codes are
usually written and enforced. A typical code might require that a surface
with more than 2.0 mg/cm^2 of lead be deleaded according to some acceptable
procedures but would not require distinct action for each lead level,
such as 2.5, 5.0, or 10.00 mg/cm^2. The binomial distribution does not
accommodate the notion of degrees or gradation of hazard, but can only
distinguish between hazardous and non-hazardous levels.
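
To illustrate the yes-no reduction, the sketch below turns XRF readings into a hazardous or non-hazardous flag for each dwelling and then into a sample proportion. The 2.0 mg/cm^2 threshold is the example code level quoted above; the rule that a single reading above the threshold makes the whole unit hazardous, the function names, and the readings themselves are assumptions made for illustration.

```python
HAZARD_THRESHOLD = 2.0  # mg/cm^2, the example code level quoted in the text


def unit_is_hazardous(readings, threshold=HAZARD_THRESHOLD):
    """Assumed rule: a unit is hazardous if any surface exceeds the threshold."""
    return any(reading > threshold for reading in readings)


def sample_proportion_hazardous(readings_by_unit):
    """Fraction of sampled units flagged as hazardous."""
    hazardous = sum(unit_is_hazardous(r) for r in readings_by_unit.values())
    return hazardous / len(readings_by_unit)


# Made-up XRF readings (mg/cm^2) for four sampled dwellings.
readings_by_unit = {
    "unit A": [0.3, 0.8, 1.9],
    "unit B": [0.5, 2.5],
    "unit C": [10.0],
    "unit D": [2.0, 0.1],  # exactly 2.0 is not "more than 2.0"
}

print(sample_proportion_hazardous(readings_by_unit))
```

Note that units B and C count equally as hazardous even though their readings differ fourfold, which is the loss of gradation described above.
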
Sample  Accuracy 

There are many ways in which inaccuracies may be introduced into the
estimates resulting from a survey. Errors are of two general types: those
which arise because of the use of a sample to represent an entire population
and  those  which  arise  from  the  data  collection  and/or  analysis  procedures 
and  should  be  expected  to  occur  even  if  a  complete  census  were  taken.  Errors 
of  both  types  will  occur;  they  cannot  be  completely  eliminated.  They  must  be 
controlled  to  the  extent  that  the  error  is  either  negligible  or  predictable. 
Errors  introduced  by  the  use  of  a  sample  can  be  estimated  and  can  be 
controlled  to  produce  any  accuracy  required  by  use  of  the  methods  of 
statistics.  The  effect  of  errors  introduced  from  sources  other  than  the 
sampling procedure is not as easy to predict; if, however, the number of
such  errors  is  kept  small,  their  effect  will  be  negligible.  The  term 
"accuracy"  refers  to  all  errors;  the  term  "precision"  refers  only  to 
sampling  errors  (the  more  precise,  the  smaller  the  error). 
Sample  Size 

If the size of the sample (hereafter denoted by n) is close to the
number of dwelling units (called the total population and denoted by N),
precision, as one would expect, is very good. However, in almost all
practical situations the sample may be quite small relative to the total
population. The sample size is the dominant factor in the statistical
precision of the sample; the size of the total population does not
significantly affect the precision 1/. The Gallup and similar polls, for
example, interview only about 1500 persons to represent nationwide opinion.
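
One way to see why N matters so little is the standard textbook finite population correction, which multiplies the standard error of a sample proportion by sqrt((N - n) / (N - 1)). The sketch below is an illustration under that assumption and is not necessarily the procedure given in Appendix A; the function names are hypothetical.

```python
import math


def finite_population_correction(n, N):
    """Textbook correction factor for a sample of n drawn from N units."""
    if not 0 < n <= N:
        raise ValueError("need 0 < n <= N")
    if N == 1:
        return 0.0
    return math.sqrt((N - n) / (N - 1))


def standard_error(p, n, N=None):
    """Standard error of a sample proportion p, optionally corrected for N."""
    se = math.sqrt(p * (1 - p) / n)
    if N is not None:
        se *= finite_population_correction(n, N)
    return se


if __name__ == "__main__":
    # Same sample size, very different population sizes.
    for N in (50_000, 5_000_000):
        print(N, round(standard_error(0.5, 1000, N), 5))
```

For a sample of 1000, the correction stays within about one percent of 1 for both population sizes, so the standard error is governed almost entirely by n.
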

Maximum  efficiency,  in  terms  of  precision  per  sample  unit,  can  be 
attained  by  the  use  of  a  simple  random  sample  2/.  The  basic  principle 
that  produces  accurate  estimates  from  small  samples  is  that  each  unit  of  the 
total  population  must  be  equally  likely  to  appear  in  the  sample. 

There is some element of risk involved in accepting the findings
from any sample as representative of the total population. Tables 1
and 2 illustrate the magnitude of this risk. If a very large number of
samples of the indicated size were drawn from an infinite population, the
confidence intervals computed from 95 percent (Table 1) or 90 percent
(Table 2) of these samples would contain the true percent of
hazardous units of the total population. For any given sample at the
95 percent level, there is a five percent risk that the true value is
not contained within the given confidence interval. For example, if the
percentage of hazardous dwellings in the sample is 10 percent, a sample
size of 100 gives 95 chances out of 100 that the percentage
of hazardous dwellings for the total population lies between
4.1 and 15.9 (Table 1), whereas there are 90 chances out of 100 that the
true percentage of hazardous dwellings lies between 5.0 and 15.0 (Table 2).
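
The worked example can be checked with the usual normal approximation to the binomial, in which the interval is the sample proportion plus or minus z times sqrt(p(1 - p)/n). Whether Tables 1 and 2 were built this way is an assumption; the sketch below, with a hypothetical function name, reproduces the 4.1 to 15.9 figures at the 95 percent level and comes within about 0.1 percentage point of the 5.0 to 15.0 figures at the 90 percent level.

```python
import math
from statistics import NormalDist


def proportion_confidence_interval(p_hat, n, confidence=0.95):
    """Normal-approximation interval for a sample proportion p_hat."""
    # Two-sided critical value, e.g. about 1.96 for 95 percent.
    z = NormalDist().inv_cdf(0.5 + confidence / 2)
    half_width = z * math.sqrt(p_hat * (1 - p_hat) / n)
    return max(0.0, p_hat - half_width), min(1.0, p_hat + half_width)


if __name__ == "__main__":
    for level in (0.95, 0.90):
        low, high = proportion_confidence_interval(0.10, 100, level)
        print(f"{level:.0%}: {100 * low:.1f} to {100 * high:.1f} percent")
```

The approximation is weakest for small samples and for proportions near zero or 100 percent, where the statistician may prefer an exact binomial method.
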

Note that an increase in sample size (for a given percentage of
hazardous dwellings) always improves the precision, but the improvement
is only modest in going from a reasonably large sample to a still larger
one. For example, in Table 1, for 50 percent hazardous units, doubling
the sample size from 50 to 100 narrows the confidence interval by 8.2
percentage points whereas doubling the sample size from 500 to 1000 narrows
the confidence interval by only 2.4 percentage points. Note that the case
in which 50 percent of the units are hazardous yields larger confidence
intervals than any other case for a given sample size. Thus it represents the
worst case for a given sample size. Note also that the magnitude of the
confidence interval when m percent of the units are hazardous is equal to
that for (100-m) percent, and becomes smaller as m approaches zero or 100.

1/ This assumes the total population is "large" relative to the sample.
See Appendix A for procedures appropriate for small populations.

2/ Other sampling schemes beyond the scope of this manual may do as well
in terms of precision per sample unit and better in terms of precision
per dollar spent. Cluster sampling, for example, will lower the unit cost
of inspection, but will require a larger sample to achieve the same precision.
Stratified samples may be considered as combinations of simple
random samples. These more sophisticated techniques may be appropriate
for particular objectives.

There is no "right" confidence level. The particular level to
aim for - 95%, 90%, or some other level - is a policy decision which
must be based on the following factors:

1. The cost of performing the survey at various sample sizes.

2. The confidence interval which is required. This is a threshold
determination problem; presumably a different course of action
will be undertaken depending on whether the percentage of hazardous
units is greater or less than m%. How precisely must m be
known in order to make a good decision? What are the consequences
if m is off by 5% or by 10%, etc.?

3. The risk involved, for example, in adopting a course of action
based on a 90% confidence level rather than on a 95% or 99% one.

If alternative courses of action differ greatly in cost (whether in
dollars, time, difficulty, or whatever), the acceptable risk should
be small.
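
The trade-off among these factors can be explored with the standard large-population formula n = z^2 p(1 - p) / E^2, where E is the tolerable half-width of the confidence interval. The sketch below assumes the normal approximation and, by default, the worst case of 50 percent hazardous units noted earlier; the function name and parameters are hypothetical, and the result is a planning figure to discuss with the statistician, not a prescription.

```python
import math
from statistics import NormalDist


def required_sample_size(half_width, confidence=0.95, p=0.5):
    """Sample size giving the requested half-width (large population assumed)."""
    if not 0 < half_width < 1:
        raise ValueError("half_width must be a fraction between 0 and 1")
    z = NormalDist().inv_cdf(0.5 + confidence / 2)
    return math.ceil(z ** 2 * p * (1 - p) / half_width ** 2)


if __name__ == "__main__":
    # Compare the policy choices discussed above: tolerance vs. confidence.
    for level in (0.90, 0.95, 0.99):
        for tolerance in (0.05, 0.10):
            n = required_sample_size(tolerance, level)
            print(f"confidence {level:.0%}, +/-{tolerance:.0%}: n = {n}")
```

Because the half-width enters as a square, halving the tolerable error roughly quadruples the number of dwellings to inspect, which is why the accuracy requirement drives survey cost.
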
Non-Sampling Errors

Non-sampling errors may be either conceptual or mechanical. There
are no statistical formulae which can predict how seriously these errors
will affect the survey results. Such errors cannot be totally avoided or
controlled, and great care at every stage of the survey is essential to
minimize them. This care includes, of course, conscientious performance
of all tasks; it also requires the development and implementation of procedures,
at extra cost, for assuring that the number of errors is small. Your
consulting statistician can furnish valuable advice based on your specific
objectives and circumstances. Here are some of the non-sampling errors
which are likely to arise in a lead paint hazard survey:

1)  Non-random  sampling 

Any  factor  which  introduces  an  unknown  bias  into  the  sample  can 
introduce  errors  into  the  findings.  For  example,  use  of  a  telephone 
directory  for  the  selection  of  households  would  preclude  selection  of 
households with no telephone and those with an unlisted telephone number,
and  would  offer  a  multiple  chance  of  selection  of  households  with  more 
than  one  telephone. 

2)  Inaccuracy  of  listing 

If  the  list  from  which  the  sample  is  drawn  is  incomplete  or 
seriously  out  of  date,  errors  will  be  introduced  into  the  findings.  For 
example,  a  city  directory  cannot  be  used  to  select  a  sample  to 
represent  a  metropolitan  (city  and  suburbs)  area. 

3)  Survey  questionnaire  ambiguities 

The  data  collection  procedure  must  be  completely  reproducible:  that 
is  to  say,  any  team  of  inspectors  should  produce  equivalent  data  for  the 
same housing unit. This requires a good form design and well-trained
inspectors.

A common procedure in many household surveys is to re-interview a
sample of the respondents. Re-inspection is not recommended for a lead paint
survey for the following reasons. First, the burden on the occupant (time)
and the intrusion into his privacy (each room of the dwelling must be
entered and inspected) are greater than in the usual survey. Second,
including enough information on the data collection form to uniquely identify
each surface and room of the dwelling (for re-measurement) requires a
significant increase in the volume of data collected, and these extra data have
no intrinsic value. A post-inspection interview, performed by the supervisors,
can and should be done for some fraction of the respondents. This can be
either a doorway or telephone interview. It should include verifying
all of the questionnaire information supplied by the respondent. It
should also include questions such as:

When  did  the  inspectors  arrive? 
How  long  did  the  inspection  take? 
Were  all  rooms  of  the  unit  inspected? 

The inspectors will be aware of the post-inspection interview; thus
it has value as a supervisory device as well as a check on the consistency
of the respondents.

4)  Detection  instrument  errors 

This  is  one  of  the  errors  peculiar  to  a  lead  paint  survey.  The 
detection  devices  are  simply  not  as  accurate  and  reliable  as  one  would 
like.  This  source  of  error  cannot  be  eliminated,  but  the  error  can  be 
minimized  by  use  of  proper  calibration  procedures  and  good  inspector 
training.


5) Inspector performance

Most of the data gathered in the survey are instrument readings; the
remainder are simple "yes-no" answers or answers to multiple choice
questions, so there is somewhat less chance for the inspector to "lead" the
respondent into the "right" answer. The consistency and reliability of the
inspectors in making all of the appropriate and necessary measurements of
painted surfaces in the inspected dwellings are of major importance in minimizing
survey  errors.  In  addition  to  diligence,  they  must  be  accurate  in 
recording  lead  measurements  and  other  pertinent  information  on  the  survey 
questionnaire. 

6) Respondent accuracy

This is related to inspector performance and questionnaire ambiguities.
This factor can be a serious problem in a general household survey but is
less likely to be serious in a lead paint survey.

7) Mechanical and clerical errors

These errors will occur; they include transcription and coding errors
(including omissions), keypunch errors, illegibility, loss, destruction or
mutilation of data collection forms, and arithmetic errors in manual tabulation.
Although  they  will  presumably  be  random  and  therefore  will  tend  to  cancel 
each  other,  all  reasonable  precautions  to  avoid  these  errors  should  be 
taken.  Such  precautions  should  include  keypunching  directly  from  the 
data  collection  form  (no  transcription  errors  can  occur  if  transcription 
is  avoided),  verifying  all  card  punching,  close  monitoring  of  all  forms 
by  supervisors,  etc. 

8) Non-existent unit

A dwelling unit selected into the sample may not exist. This could
be due to an address error, demolition of a unit, conversion to
non-residential use, etc.

9) Non-responses

Inevitably,  entry  to  some  dwellings  will  be  impossible.  The  unit 
may  be  vacant,  the  occupant  may  not  be  at  home  or  the  occupant  may  refuse 
entry  to  the  inspectors.  Every  effort  should  be  made  to  keep  non-responses 
to  a  minimum,  but  for  each  non-response,  those  data  which  can  be  obtained  by 
observation  should  be  collected.  These  data  are  useful  in  detecting  any 
systematic  bias  induced  by  non-responses.  The  actions  required  differ 
according  to  the  cause  of  the  non-response.  For  non-existent,  vacant, 
or  non-residence  units,  nothing  can  be  done;  for  not-at-homes,  efforts 
should  be  made  to  reschedule;  for  refusals,  rescheduling  should  be 
attempted  and  the  occupant's  reason  for  refusal  should  be  determined  -- 
this  information  may  be  useful  in  retraining,  pairing  of  inspection 
teams,  etc. 

PROPER  TIMING  FOR  THE  SURVEY 

After  you  have  established  the  objectives,  the  number  of  units  to 
be  inspected,  and  the  data  to  be  collected  for  each  unit,  planning  for 
the  operational  part  of  the  survey  can  begin. 

Our  experience  in  surveys  for  lead  paint  has  shown  that  the 
project  is  most  efficiently  performed  during  the  summer.   There  are 
no  slushy  or  icy  streets  and  no  heavy  coats  and  boots  to  encumber  the 
inspectors,  and  most  importantly,  householders  are  more  likely  to  permit 
inspection  in  good  weather.  In  some  temperate  climates,  the  season  may 
be  of  little  importance  in  choosing  the  period  for  the  survey. 

College  students  who  plan  to  return  to  school  in  the  fall  tend  to  work 
faster than other inspectors, who know that the sooner the project ends,
the sooner they will be unemployed. During one survey, for example,
student teams completed an average of 5.5 unit inspections a day while
other teams averaged only 4 units.

There  are,  of  course,  administrative  and  managerial  problems 
involved  in  such  a  concentrated  effort.  The  administrator  may  be 
running  several  other  programs  and  have  only  a  limited  amount  of 
time  to  spend  on  the  lead  paint  survey.  Money  may  be  tightly  budgeted. 
But  it  is  most  efficient,  in  the  long  run,  to  hire  as  many  inspectors 
as  is  necessary  to  complete  the  survey  during  the  summer. 
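
The number of teams needed to finish within the summer follows from simple
arithmetic. The sketch below uses the two daily rates quoted above (5.5 and
4 inspections per team per day); the sample size of 2,000 units and the 60
working days are hypothetical figures chosen only for illustration.

```python
from math import ceil


def teams_needed(units_to_inspect, units_per_team_per_day, working_days):
    """Smallest whole number of two-person teams that can inspect all
    units in the available working days at the given daily rate."""
    capacity_per_team = units_per_team_per_day * working_days
    return ceil(units_to_inspect / capacity_per_team)


# Hypothetical survey: 2000 units to inspect in 60 working days.
print(teams_needed(2000, 5.5, 60))  # at the student-team rate
print(teams_needed(2000, 4.0, 60))  # at the rate of the other teams
```

A calculation of this kind should be repeated after the pretest, when a daily
rate for your own community is available, and should allow for call-backs and
rescheduled inspections.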

CHARACTERISTICS  TO  LOOK  FOR  IN  HIRING  INSPECTORS 

Experience in previous surveys has shown that the teams most
successful in being accepted by housing occupants were made up of a man
and a woman, or of a black and a white inspector. You should, therefore,
consider these public preferences in hiring.

While  enthusiasm  and  courtesy  are  extremely  important,  a  certain 
adaptability  and  resilience  are  indispensable  in  dealing  with  the  public. 
Inspectors should not overreact, for example, if an occupant answers the
door  clad  only  in  a  wristwatch,  as  has  happened. 

One  requirement  for  employment  may  have  to  be  ownership  of  a  car. 
An  inspection  team  must  have  a  car  to  get  to  its  daily  assignments.  If 
the  sponsoring  agency  cannot  provide  each  team  with  a  vehicle,  then  the 
inspectors  will  have  to  use  their  personal  cars  and  be  reimbursed 
for  expenses  as  agreed.  The  inspectors  should  have  a  general  knowledge 
of  the  city  and  be  able  to  get  around  in  it. 

The  prospective  inspectors  should  also  understand,  when  they  are 
hired, that they may be required to work irregular shifts and/or Saturdays.
Since a great many occupants have full-time jobs, they are not available
between 9 AM and 5 PM to open their homes for inspection.

Also  remember,  in  hiring,  that  the  lead  detection  equipment  weighs 
approximately  twenty  pounds.  Your  inspectors  will  have  to  be  able  to 
carry  this  weight  a  good  part  of  the  day. 

SUPERVISOR  TRAINING 

While the publicity campaign (page 34) is still underway and approximately
ten to fifteen days before the inspectors come on board, the
supervisors  should  begin  training  for  the  survey.  If  they  are  not  part 
of  the  department  already,  and  are  not  familiar  with  the  lead  paint  problem, 
they  must  be  completely  familiarized  with  the  project,  besides  learning 
all  the  particulars  of  running  the  survey. 

Since  the  supervisors  will  be  leading  the  training  sessions  for  the 
inspectors, they will have to learn how to operate the XRF lead detector,
how  to  complete  the  survey  forms,  and  generally  how  to  conduct  the  inspection. 
This  includes  learning  all  about  the  various  housing  materials  and  conditions, 
so  that  they  can  instruct  the  inspectors  in  filling  out  the  survey  form 
(Figure  1)  and  check  the  completed  forms  for  accuracy.   (This  is  why  people 
with  experience  in  housing  were  suggested  for  this  job.) 

The pretest (a mini-survey held in advance to check for problems in
the survey plan) would most conveniently be held at this time. It is an
important testing ground for all the tasks that the supervisors must learn and
later teach the inspectors. During this period the supervisors gain
first-hand experience of the actual survey tasks as well as insight into the
human problems that are likely to be encountered during the survey.

During this pre-survey period, the supervisors can help the project
administrators make certain decisions -- for instance, whether or not automobiles
used  during  the  survey  should  be  identified.  This  may  seem  trivial  but  could 
make  a  difference  in  participation.  In  one  instance,  a  rat  elimination 
program  was  being  conducted  simultaneously  with  a  lead  paint  survey. 
Many  people  refused  entry  to  the  lead  inspectors  for  fear  their  neighbors 
would  think  the  department  car  was  there  because  of  rats.  Obvious 
identification  would  also  help  to  discourage  any  sort  of  impersonation 
for  criminal  purposes. 

Another  question  is  whether  it  is  best  to  saturate  a  neighborhood 
with  all  the  teams  at  once  or  assign  just  one  group  to  a  neighborhood. 
One  team  can  become  thoroughly  familiar  with  an  area  and  the  residents 
can  become  comfortable  with  their  presence.   On  the  other  hand,  the 
area  saturation  approach  allows  closer  supervision. 

Inspectors have to be properly identified. Photographs for identification
badges should be taken as soon as the inspectors are hired so that
everything  will  be  ready  by  the  first  inspection.  Residents  may  deny 
entry  to  anyone  who  is  not  properly  identified. 

The  local  police  should  be  informed  that  the  survey  is  being  performed 
and  told  what  sort  of  identification  is  used  by  the  inspectors. 

The  project  administrator  and  supervisors  have  other  administrative 
details  to  discuss  as  well,  though  final  decisions  need  not  be  made 
until  further  into  the  project.  For  instance,  when  the  inspections 
begin,  the  supervisors  will  want  to  meet  with  all  the  inspectors 
daily. These meetings have two purposes. The first is the training value of
sharing inspection experiences, finding gaps or ambiguities in training
or questionnaires, etc. The second is to permit the supervisors to collect
and spot-check questionnaires and to make new assignments. After the learning
value  diminishes,  these  meetings  could  be  reduced  to  once  a  week  and  each 
inspection team could operate more autonomously. The XRF units could then
be taken home by the inspectors for calibration and recharging, saving
considerable time. Such short cuts and efficiencies will depend on the
policy of the agency and on the personal motivation of the inspectors,
which can only be determined after the project is underway.

INSPECTOR  TRAINING 

Approximately  one  week  before  the  survey  is  to  begin,  the  inspectors 
should  be  brought  on  board.  Normally,  training  should  be  a  relatively 
simple operation, requiring only about two days (4 half-day sessions).
It  can,  of  course,  take  longer  if  detailed  departmental  policy  must  be 
presented  or  additional  "dry-runs"  are  required. 

The first morning of the two-day training should be spent in introducing
the lead paint poisoning problem to the inspectors and explaining
to  them  how  the  housing  survey  will  help  in  dealing  with  it.  This  is  an 
extremely  important  portion  of  the  training.  The  inspectors  must  realize  that 
they  are  participating  in  a  meaningful  activity  so  that  they  will  give 
their best efforts to it. Their enthusiasm or lack of enthusiasm for the
program  will  be  perceived  by  occupants,  who  will  react  accordingly.  The 
inspectors must also gain enough knowledge of lead poisoning, sampling,
and  XRF  characteristics  to  answer  the  expected  questions  from  the  householder 
with  accuracy  and  confidence.  Also  during  this  first  session,  the  sampling 
technique  should  be  explained  to  the  group.  They  should  know  how  each 
dwelling  unit  was  chosen  and  the  importance  of  inspecting  each  and  every  one. 

The lead detection equipment should be described at the next session.
Instruction  should  be  given  on  its  operation,  the  meaning  of  its  readings 
and  especially  how  the  instruments  are  calibrated.  The  importance  of 
recording  the  readings  from  the  calibration  dials  should  be  stressed.  It 
should be emphasized that the lead detection equipment is delicate and
should  be  handled  carefully,  not  jostled  around  or  thrown  in  car  trunks. 

Instruction  in  filling  out  the  inspection  form  (see  Appendix  B) 
will  require  most  of  the  third  training  session.  Distinguishing 
various  room  conditions  as  described  under  Item  XI  of  the  Questionnaire 
needs  major  emphasis.  Photographs  or  color  slides  which  demonstrate  these 
conditions  would  be  most  helpful.  Instructions  in  how  to  move  about  the 
dwelling  unit,  and  how  to  record  the  readings  can  also  be  given.  The 
possibility  of  "read  through"  (detecting  lead  on  another  surface  behind 
the  one  being  read)  should  be  explained.  The  inspector's  responsibility 
for equipment, procedures, schedules, and the need for courtesy and
conscientiousness should be emphasized at this point.

To  close  out  the  training  session,  each  team  should  carry  out  an 
actual  inspection  in  at  least  one  test  dwelling  unit.  Supervisors  should 
take  turns  acting  as  an  occupant  who  is  as  obstinate  and  uncooperative  as 
possible.  This  will  give  the  inspectors  experience  in  dealing  with 
difficult  people. 

From experience, we have found that several practice inspection
sessions for each team improve performance. Emphasis on proper
classifications for wall materials, conditions, and type of home construction
will result in more consistent and accurate completion of the forms.

The final session includes instructions in the proper approach to
use in meeting occupants. The importance of this is obvious. If the
resident is offended or irritated by the way the inspectors present
themselves, they may be refused entry.

The  survey  team  should  present  itself  in  a  courteous  manner.  One 
team  member  should  immediately  set  up  the  lead  detection  equipment  while 
the  second  begins  completing  the  form  by  asking  the  occupant  for  the 
necessary  information.  The  inspectors  should  make  every  effort  to  have 
the  occupant  accompany  them  during  the  inspection.  As  the  inspectors 
proceed  through  the  dwelling  they  should  answer  any  questions  posed  by 
the  occupant  courteously,  truthfully,  and  briefly.  A  dwelling  unit 
inspection  should  generally  require  thirty  minutes  or  less.  The  team 
should  keep  this  in  mind  and  move  through  the  task  quickly. 

Inspectors  should  check  the  exteriors  of  vacant  units  or  units 
whose  residents  have  refused  entry.  They  should  complete  the  form  for 
those  characteristics  which  are  observable  from  outside  the  unit.  This 
information  can  be  used  to  determine  whether  or  not  there  is  a  pattern 
to  the  visible  characteristics  of  units  to  which  entry  has  been  refused. 
For  example,  if  the  refusals  have  come  primarily  from  residents  of  units 
built  before  1940,  that  would  clearly  bias  the  survey  results. 
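
A simple tabulation of refusal rates by an externally observable
characteristic is enough to show whether such a pattern is developing. The
sketch below is only an illustration: the record layout, the field names
("built", "status"), and the sample records are hypothetical and are not
taken from the survey form.

```python
from collections import defaultdict


def refusal_rates(units, characteristic):
    """Return the fraction of attempted units that refused entry,
    grouped by one observable characteristic of the unit."""
    attempted = defaultdict(int)
    refused = defaultdict(int)
    for unit in units:
        group = unit[characteristic]
        attempted[group] += 1
        if unit["status"] == "refused":
            refused[group] += 1
    return {group: refused[group] / attempted[group] for group in attempted}


# Hypothetical records compiled from exterior observation of attempted units.
units = [
    {"built": "before 1940", "status": "refused"},
    {"built": "before 1940", "status": "completed"},
    {"built": "before 1940", "status": "refused"},
    {"built": "before 1940", "status": "completed"},
    {"built": "1940 or later", "status": "completed"},
    {"built": "1940 or later", "status": "refused"},
    {"built": "1940 or later", "status": "completed"},
    {"built": "1940 or later", "status": "completed"},
]

for group, rate in refusal_rates(units, "built").items():
    print(f"{group}: {rate:.0%} refused")
```

Markedly different rates between groups should be brought to the attention of
the survey statistician; whether a difference is large enough to matter is a
statistical judgment that this tabulation does not make.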

After  the  first  day  of  field  work,  the  survey  teams  can  be  assembled 
for  a  general  group  discussion  on  experiences  and  problems  encountered. 
At  this  time  the  forms  completed  by  each  team  can  be  examined  for 
consistency. Immediate feedback to the inspectors is important so that
errors can be corrected while the day's activities are still fresh in
their minds. This session should be used to identify frequently occurring
questions or attitudes of the residents, and the abilities of the inspectors
to cope with them. It should help in finding gaps in the training program
and  in  establishing  standard  answers  to  common  questions. 

Supervisors  should  randomly  select  one  of  each  team's  completed 
forms,  on  a  daily  basis,  and  check  it  for  completeness  and  consistency. 
This  procedure  results  in  correctly  completed  forms  in  almost  all  cases. 
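
If the day's completed forms are logged by team, the daily selection can be
made mechanically so that no team can predict which form will be checked.
This is a minimal sketch; the team names and form numbers are hypothetical.

```python
import random


def pick_forms_for_review(forms_by_team, rng=random):
    """Choose one completed form at random for each team that turned in
    at least one form today."""
    return {
        team: rng.choice(forms)
        for team, forms in forms_by_team.items()
        if forms
    }


# Hypothetical log of today's completed forms, keyed by team.
todays_forms = {
    "Team A": ["501004", "501017", "501023"],
    "Team B": ["502001", "502009"],
    "Team C": [],  # no completed inspections today
}

for team, form_id in pick_forms_for_review(todays_forms).items():
    print(f"{team}: review form {form_id}")
```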

If  new  inspectors  must  be  hired  during  the  survey,  very  little 
formal  training  is  required.  The  background  and  organizational 
orientation  (first  session  of  the  full  program)  should  be  repeated.  Each 
new  inspector  should  then  be  paired  with  one  of  the  experienced  inspectors. 

THE  SURVEY  QUESTIONNAIRE 

The  lead  paint  survey  questionnaire  developed  by  NBS  is  shown  in 
Figure  1.  Detailed  instructions  for  its  use  are  given  in  Appendix  B. 
You  may  choose  to  collect  data  that  differ  from  those  appearing  on 
the sample questionnaire. In any case, it is important to exercise
foresight in designing the data collection form to be used in the survey.

Redundancy  should  be  designed  into  the  survey  questionnaire  to  check 
the  consistency  of  completed  forms.  For  example,  in  the  sample  form, 
Outside  Surface  of  Building  (XS)  should  be  the  same  as  the  material 
code in COND or the EXTERIOR section (see Appendix B). Some of the
data  may  be  checked  for  consistency  as  well.  Examples  are  blank  column 
checks,  check  for  exterior  readings,  check  for  a  kitchen,  etc.  Again, 
the  pretest  is  an  important  preview  of  the  workability  of  the  survey 
form. 
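
Checks of this kind are easily automated once the forms have been coded. The
sketch below shows one way the four checks mentioned above might be written.
The field names and the dictionary layout are hypothetical; the actual item
codes are those defined in Appendix B.

```python
def check_form(form):
    """Return a list of consistency problems found on one coded form.
    An empty list means the form passed every check."""
    problems = []

    # Blank column check: every coded item must have an entry.
    for field, value in form.items():
        if value in ("", None):
            problems.append(f"blank entry: {field}")

    # Redundancy check: the outside surface code (XS) should agree with
    # the material code recorded in the exterior section.
    if form.get("XS") != form.get("exterior_material"):
        problems.append("XS does not match exterior material code")

    # Every inspected unit should have at least one exterior reading.
    if not form.get("exterior_readings"):
        problems.append("no exterior readings")

    # Every dwelling unit should include a kitchen.
    if "kitchen" not in form.get("rooms", []):
        problems.append("no kitchen recorded")

    return problems


# Hypothetical coded form with a mismatched material code and no kitchen.
form = {
    "XS": 3,
    "exterior_material": 5,
    "exterior_readings": [0.4, 1.2],
    "rooms": ["living room", "bedroom"],
}
print(check_form(form))
```

Forms that fail a check should be returned to the inspection team promptly,
while the inspection is still fresh in their minds.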

The  primary  purpose  of  the  lead  paint  survey  is,  of  course,  to 
determine  the  incidence  of  lead  paint  in  the  homes  in  your  community. 
A  survey  such  as  this  is  expensive,  in  terms  of  dollars  and  time  spent, 
but the incremental cost of gathering additional information may be small
by  comparison  1/.  Therefore  every  consideration  should  be  given  to  the 
collection  of  supplementary  or  incidental  data  which  might  be  valuable 
to  your  department  or  to  other  departments.  The  criteria  for  deciding 
on  the  inclusion  of  additional  observations  should  be  that  (1)  they  do 
not  jeopardize  the  primary  survey  objectives  and  (2)  the  value  of  the 
additional  information  exceeds  the  cost  of  gathering  it. 

PUBLICITY  CAMPAIGN 
Our  experience  has  shown  that,  for  the  most  part,  the  more  informed 
people  are  about  a  lead  paint  survey,  the  more  cooperative  they  are. 
For  this  reason  a  publicity  campaign  should  be  started  two  weeks  to  a 
month  before  inspections  begin.  The  dangers  of  lead  paint  poisoning, 
the  reasons  for  the  lead  paint  survey,  and  the  potential  benefits  of 
the  survey  to  the  community  should  be  explained.  Mention  that  the 
project  is  being  conducted  in  all  sections  of  the  city,  in  high  and  low 
income  areas,  and  that  the  lead  paint  detector  will  not  harm  the  occupants 
or  their  possessions.  Make  the  message  simple  and  stress  the  positive 
aspects  of  the  program.  Try  to  prevent  rumors  before  they  begin.  If 
someone  believes  that  a  penalty  will  be  imposed  or  personal  costs  will 
be  incurred  if  lead  paint  is  found  in  his  home,  he  will  not  welcome  the 
inspectors. Be sure to mention that each home has been randomly selected
(both those with and without children) and that the occupants will receive
a letter indicating that their homes have been chosen.


1/ Remember that there is a two-person team. The inspection requires 100%
of the XRF user's time but only a fraction of the recorder's. Thus
almost any yes-no or multiple choice observation about the unit is free
with respect to inspection cost. The only costs are in training and in
data handling. When questions requiring subjective answers or opinions
are put to respondents, things can become quite complicated. This is
not recommended as part of a lead survey.

Some surveying organizations have expressed concern that an extensive
publicity campaign might create security problems such as criminal
impersonation of the inspectors to gain entry to the dwelling units. This risk
can be minimized by advertising that official letters will be sent to the
chosen homes prior to the survey, by stating in the letter that a telephone
appointment will be made, and finally by ensuring that the inspectors appear
at the appointed time and are properly identified.

Independent studies have shown that information received through
television, radio and newspapers is more readily accepted than information
received from other sources. So it is important to solicit the cooperation
of these media from the start. Public service announcements,
informing the public of the survey, are invaluable. Since public
service television time is the hardest to obtain, spot TV messages can be
used, with more detail being offered in radio announcements and in the
newspapers.

It  is  a  good  idea  to  compose  the  introductory  letter  (mentioned 
above)  at  this  time  so  that  copies  can  be  ready  when  they  are  needed. 
The  letter  should  be  simply  and  clearly  written  and  include  a  statement 
of  purpose,  the  number  of  homes  selected  and  the  personnel,  time  and 
equipment  involved.  The  introductory  letter  shown  in  Figure  2  was 
used  in  a  survey  in  which  all  inspections  were  by  appointment.  The 
occupant  was  subsequently  contacted  by  telephone  to  arrange  an  inspection 
time and answer questions. If non-appointment inspections are to be
made, an information telephone number should be included. As with the
rest of the publicity campaign, it is best to stress the positive rather
than the negative aspects of the project. Letters which do not dwell on
the health hazards seem to be the most effective. Letters on official
stationery carrying the signature of the best known department official
will earn the greatest credibility and cooperation.


Figure 2

SAMPLE
INTRODUCTORY LETTER


Dear (Your Jurisdiction) Citizen:


The (Your Department) is conducting a study to determine the amount
of lead paint on the walls of your home. High lead levels in paint
may be a health hazard, especially for young children. The study will
be conducted in homes where there are no children as well as in homes
with children.

(Your Number) homes have been chosen as a cross section to represent
all neighborhoods and types of homes in (Your Jurisdiction). Your
home is among those randomly selected for this study.

Within the next two weeks, you will be contacted by phone to arrange
a mutually convenient time for two (Your Department) staff members,
bearing identification cards, to visit your home. They will request
your permission to take measurements of the lead content of the paint
on interior walls and woodwork, as well as painted exterior surfaces.
The measurements will take about one-half hour. They will use a
portable lead detector that is safe for you and your possessions,
uses no chemicals, and does not harm or mar the surfaces measured.

Thank you for your cooperation in this important project.

Sincerely,


(Name)
(Title)

PRETEST 

Before beginning the actual housing survey, it is beneficial, if not
absolutely necessary, to first have a pretest -- that is, to select a small
number of dwellings (in addition to the sample) and inspect them, using the
supervisors  and  administrators  as  inspectors.  This  exercise  will  highlight 
problems  in  the  data  collection  form,  if  there  are  any,  and  will  give  the 
supervisors  experience  with  the  situations  that  the  inspectors  will  face 
during  the  actual  survey.  Several  estimations  made  prior  to  the  survey 
can  be  validated  during  a  pretest.  Survey  costs  and  budgeting  can  be 
redetermined.  The  pretest  is  the  time  to  make  final  adjustments.  Changes 
made  as  a  result  of  the  pretest  will  make  for  a  more  efficient  housing 
survey. 

RECORD  KEEPING 

As the units are selected from the directory, the necessary information
should be transferred to file cards. Make two cards for each dwelling
and file them separately -- one in a control file for the project
administrator and the other for the supervisor's or project manager's work file.
For  efficient  control  and  retrieval  the  cards  should  be  filed  according  to 
census  tract.  The  census  tract  number  plus  a  serial  number  will  serve  to 
distinguish  each  dwelling  unit  from  all  others  in  the  tract  and  the  total 
sample. For instance, if there are 300 houses in census tract 501, the
cards will be numbered 501001, 501002, ..., 501300.

Some supervisors have found it efficient to maintain several files
or several sections within a file, in order to divide the dwellings according
to their survey status: completed units, refusals, not-at-homes, vacant
units, etc. This helps the supervisors to determine the status of the
project -- how many have been inspected, which homes have to be rescheduled,
and how many additional sample sets have to be drawn.


Address:                              Identification No.:

Name of Occupant:                     Phone:

Final Insp. Date:                     Names of Insp.
                                      Team Members:

Comp.   NDU   Ref.   NSA   Vac.   UC   Demo.
No. and Ages of Children in Residence:


Comp. - completed                     Vac. - vacant
NDU - not a dwelling unit             UC - under construction
Ref. - refusal                        Demo. - demolished
NSA - no such address
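
If the control file is also kept in machine-readable form, the identification
numbers and a status summary can be generated directly. The sketch below
follows the numbering scheme described above (census tract number plus serial
number) and the status abbreviations on the file card. The three-digit serial
width matches the example for tract 501 but is an assumption; a tract with
more than 999 sampled units would need a wider serial. The function names and
the sample cards are hypothetical.

```python
from collections import Counter

# Status abbreviations as they appear on the file card.
STATUS_CODES = {
    "Comp.": "completed",
    "NDU": "not a dwelling unit",
    "Ref.": "refusal",
    "NSA": "no such address",
    "Vac.": "vacant",
    "UC": "under construction",
    "Demo.": "demolished",
}


def identification_numbers(census_tract, unit_count):
    """Census tract number followed by a three-digit serial number."""
    return [f"{census_tract}{serial:03d}" for serial in range(1, unit_count + 1)]


def status_summary(cards):
    """Count the cards in each survey status."""
    return Counter(card["status"] for card in cards)


ids = identification_numbers(501, 300)
print(ids[0], ids[1], "...", ids[-1])

# Hypothetical cards for a few units in tract 501.
cards = [
    {"id": "501001", "status": "Comp."},
    {"id": "501002", "status": "Ref."},
    {"id": "501003", "status": "Comp."},
    {"id": "501004", "status": "Vac."},
]
for code, count in status_summary(cards).items():
    print(f"{STATUS_CODES[code]}: {count}")
```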


NOTIFICATION  OF  RESIDENTS 
Approximately  five  to  ten  days  in  advance  of  the  first  inspection, 
introductory  letters  should  be  sent  to  the  group  of  dwelling  units  to  be 
inspected  during  the  first  week.  Addressing  the  letters  to  "Occupant" 
will  save  a  lot  of  time  in  the  long  run.  After  the  initial  mailing, 
letters  should  be  sent  out  frequently  enough  to  ensure  having  a  group  of 
notified  residents  with  whom  telephone  appointments  can  be  made. 

There are advantages and disadvantages to telephone scheduling. It
definitely saves inspectors' time and it serves to confirm the introductory
letter, thereby giving residents a greater sense of security in
allowing  the  inspectors  to  enter  their  homes.  However,  telephoning  also  gives 
the  individual  an  opportunity  to  refuse  inspection.  Some  inspectors  are 
more  successful  in  gaining  entry  when  they  arrive  at  a  dwelling  unannounced. 
For  this  reason,  they  prefer  the  "cold-call"  method  (no  additional 
notification after the letter).

If  telephone  scheduling  is  used,  appointments  should  be  arranged  a  day 
in  advance.  An  approach  like  the  following  has  been  used  successfully  for 
this  purpose: 

Caller: Hello. This is (caller's name) from the (your department)
calling. Did you receive the letter about the Lead Paint Program?

Occupant:  Why  yes,  I  did. 

Caller:  Good.  Then  would  it  be  convenient  with  you  if  our  inspectors 
came  by  about  10  o'clock  tomorrow  morning? 

Occupant:  I  guess  so. 

Caller:  Fine,  they  will  be  there  at  10:00  a.m.  tomorrow.  Thank  you 
very  much. 

Telephone refusals should be anticipated. However, you should try
to  convince  the  occupant  to  permit  an  inspection.  You  can  prepare  a 
list  of  common  reasons  for  refusal  and  develop  effective  arguments  to 
counteract  them.  Typical  reasons  for  refusal  are: 

1)  Suspicion  of  governmental  inspection  of  their  residences; 

2)  Doubt  of  the  study's  importance  and  intent; 

3)  Would  not  be  at  home  for  the  inspection; 
4) Have no children and fail to see why you wish to include them
as participants;

5) Have just painted their dwelling with non-leaded paint.

It  may  be  necessary  for  the  scheduler  to  spend  some  evenings  and/or 
Saturdays  making  appointments  with  people  who  are  not  at  home  any  other 
time. 

Every  attempt  should  be  made  to  minimize  refusals.  However,  if  they 
cannot  be  avoided,  additional  samples  can  be  drawn  to  make  up  the  deficit. 
If there is a pattern to the refusals (for example, if everyone with a house
built before 1940 denies entry), this will bias the findings 1/. For this
reason, it is a good idea to keep records on the types of houses from
which you are being barred. The survey statistician will use this information
in the reconciliation of population data source-sample-survey
requirements.  Fortunately,  our  previous  lead  paint  surveys  have  not 
encountered  such  biases. 

The self-addressed, stamped call-back card (Figure 3) should be left
at the door of residents who are not home even after scheduling appointments,
or who never seem to be in for telephoning.

If  the  cards  are  not  returned,  the  inspectors  may  have  to  make 
unannounced  visits  in  order  to  complete  the  survey. 


1/ Because refusals stem from a particular attitude, they are not as
serious here as in a household survey whose purpose is to ascertain
attitudes of respondents. In a lead paint survey, the occupant almost
certainly has no idea of the lead content of his dwelling; refusals are
therefore less likely to be related to lead content.
Figure 3
Call-Back Card


Date

Dear

Address

As explained in a letter sent to you by (your department),
two representatives tried to visit you today to measure the
lead content of the paint in your house. Since we were unable
to contact you, a rescheduling of the visit would be appreciated.
When may we return?

Date: 

Time:  9  10  11  12 

12  3  4  5  6 

7  8 

9  10 

11  12 

Check the most convenient time and drop this card in the nearest
mailbox. No stamp is needed.
SUPERVISOR FOLLOW-UP

Each  morning,  the  inspectors  have  to  calibrate  their  lead  detection 
instruments  and  record  all  resulting  data  in  their  calibration  record 
books.  Check  to  make  sure  that  this  is  being  done  properly,  especially 
for  the  first  few  days. 

During the first week the supervisors should start "dropping in"
unexpectedly  on  the  inspectors  to  help  them  out  and  to  see  how  they  are 
doing.  It  will  not  take  long  to  discover  which  inspectors  especially 
need  that  occasional  visit. 

At  the  end  of  the  first  day  the  teams  should  be  gathered  together 
for  an  informal  discussion  of  the  day's  events.   This  allows  people  a 
chance  to  learn  from  each  other's  experience.  The  supervisors  can  check 
the  survey  forms  and  point  out  any  inconsistencies  or  omissions  they  find. 
Sometimes  these  meetings  afford  an  opportunity  to  praise  someone  who  has 
done  a  particularly  good  job.  These  discussions  should  certainly  be  held 
nightly  for  a  week  or  two  but  can  become  less  frequent  as  the  project 
progresses. Do continue to spot check survey forms fairly regularly for
errors.

After  the  supervisors  have  checked  the  data  collection  forms  and  are 
satisfied  that  they  have  been  completed  properly,  the  appropriate 
information  can  be  recorded  on  the  supervisor's  cards  and  the  card  can 
be  transferred  to  the  "completed"  file. 
PART III
DATA REDUCTION AND ANALYSIS

DATA BASE ORGANIZATION

As  the  questionnaires  are  completed  by  the  inspectors  and  collected 
by  the  supervisors,  they  must  be  assembled  and  organized  into  a  data  base. 
Such  organization  or  "structuring"  of  the  data  is  needed  to  prepare  for 
and  facilitate  the  analyses  that  will  be  required  to  meet  the  specific 
objectives  of  the  survey.  The  structure  should  be  flexible  enough  to 
accommodate  reasonable  shifts  in  the  objectives  and  to  provide,  during 
the  survey,  information  which  is  useful  for  management. 

To  illustrate  the  sort  of  shifts  in  objective  which  should  be  anticipated, 
consider  the  term  "hazard".  In  Section  I,  the  term  "hazard"  was  used  as 
if  there  were  actually  a  specific  threshold  value  of  lead  concentration, 
above which it is hazardous, and below which it is innocuous. In fact,
the lead hazard threshold is determined by a legal and political process
(with input from health and medical experts and from society in general).

This basis of hazard definition creates difficulties for the analyst;
he must anticipate changes in the definition of the hazard and be prepared
to accommodate them. The system should be designed so that such changes
require little effort.

The  data  base  can  provide  much  information  to  support  the  internal 
management  of  the  survey.  In  addition  to  its  key  role  in  quality  control 
(to be discussed explicitly later), there are a number of potential uses
such  as: 

1)  Determination  of  which  of  various  techniques  (telephone  call, 
call  and  letter,  different  letters,  cold  call,  etc.)  are  most 
effective. 

2)  Determination  of  the  best  training  procedures. 
3) Determination of productivity by day of week, weather conditions,
etc. 

CONSTRUCTION  AND  MAINTENANCE  OF  THE  DATA  BASE 
The  data  base  manager  has  the  responsibility  for  both  constructing 
and  maintaining  the  data  base.  The  primary  input  is  the  set  of  completed 
questionnaires,  but  there  will  be  other  inputs  as  well.  The  organization 
of  the  data  base  is  by  dwelling  unit;  each  dwelling  unit  is  represented 
as  an  "element"  of  the  data  base.  The  characteristics  of  the  dwelling 
unit  (measurements,  codes,  etc.)  are  "data  items". 

Each  element  of  the  data  base  will  contain  some  (perhaps  all)  of 
the  questionnaire  information.  Depending  upon  the  specific  objectives  of 
the  survey,  there  may  be  data  items  from  other  sources  (census,  tax  rolls, 
school  records,  etc.)  and  some  which  are  derived  or  computed  (hazard 
indices,  mean  or  median  XRF  readings,  etc.)  as  well.  Four  conceptually 
distinct  functions  are  required  to  construct  and  maintain  the  data  base.  1/ 

First, the questionnaire must be edited. The information in the
questionnaire must be edited for intra-dwelling-unit consistency; this
includes checking for omissions (blanks), for illegal codes, and for
compatibility among those data items for which redundancy has been designed
into the questionnaire. The editing may include some coding, classifying,
or calculation to produce other data items as well.


1/ These functions include both "editing" and "recoding". Our experience
has been with large computerized data bases in which editing and
recoding have been accomplished with a single access to the data base.
If the data are to be tabulated manually and if the base is small, only
the questionnaire edit is required. The other functions do not require
formal procedures if the data base is small; they can be performed on
an ad hoc basis. The lack of formal procedures does not imply that less
editing is done; for a small sample each error is more critical than it would
be for a larger sample, and thus even more work per unit is justified to assure
the integrity of the data base.
Second, the data base edit must include checks for inter-element
consistency. Typical tests include checks for duplicate serial numbers,
for duplicate elements, and for the proper number of elements. This
editing may include adding new elements from sources other than the
questionnaire.

Third, there must be capabilities for deletion, replacement, modification,
and addition of elements (dwelling units). There must be capabilities
for replacement and modification of data items (characteristics
of dwelling units).

Fourth,  each  element  and  each  data  item  should  be  traceable  to  its 
source.  That  is,  direct  comparison  of  the  questionnaire  and  the  analogous 
element  of  the  data  base  should  be  possible. 

The  management  procedures  necessary  for  performing  these  functions 
must  be  designed  and  implemented  prior  to  the  data  collection  phase  of 
the  survey. 
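
The inter-element checks named above (duplicate serial numbers and the proper number of elements) can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the record layout (a dictionary with a hypothetical `serial` key) and the function name `check_data_base` are not taken from the manual.

```python
from collections import Counter

def check_data_base(elements, expected_count):
    """Return a list of inter-element consistency problems.

    elements: list of dicts, one per dwelling unit. Each is assumed to
    carry a hypothetical 'serial' key identifying its questionnaire.
    expected_count: number of elements the data base should hold.
    """
    problems = []
    # Count how often each serial number occurs; more than once is an error.
    counts = Counter(e["serial"] for e in elements)
    for serial, n in sorted(counts.items()):
        if n > 1:
            problems.append(f"serial {serial} appears {n} times")
    # Compare the number of elements with the number expected.
    if len(elements) != expected_count:
        problems.append(
            f"expected {expected_count} elements, found {len(elements)}"
        )
    return problems
```

An empty list means that these two checks found nothing; it says nothing about errors within an individual questionnaire, which are the business of the questionnaire edit.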

Editing  of  the  questionnaires  is  a  major  factor  in  assuring  the 
quality  of  the  data  base.  Proper  editing  procedures  and  appropriate  use 
of  editing  outputs  can  improve  supervisor  and  inspector  performance  during 
the  course  of  the  survey. 

In  any  survey  instrument,  there  should  be  some  redundancy.  That  is, 
much  of  the  data  should  be  derivable  from  other  data  collected  on  the  same 
questionnaire  or  on  some  other  questionnaire  which  may  be  associated  in 
some  logical  way.  There  are  also  characteristics  of  the  questionnaire 
which  can  be  checked  for  consistency.  Use  of  redundancy  and  consistency 
checking  improves  data  base  quality  in  several  ways. 

First, it is possible to "manufacture" valid data under some circumstances.
For example, in the sample questionnaire (Figure 1), if the code
for exterior wall material in the section showing the exterior condition and
XRF readings is left blank, the proper value can be manufactured from the
field XS, the exterior surface material. Examples of consistency checking
include checking for blank fields, for illegal codes, and for missing
lines. Every dwelling unit must have a kitchen; every single family
dwelling unit must have an accessible exterior.

Second,  the  errors  found  by  consistency  and  redundancy  checking  enable 
the  supervisors  to  determine  how  well  the  inspectors  are  performing.  A 
consistently  high  error  rate  among  all  inspection  teams  indicates  a  need 
for  more  training  or  more  supervision.  If  the  errors  are  more  frequent  for 
one  particular  check,  there  may  be  an  ambiguity  in  the  questionnaire  which 
is  causing  the  difficulty.  If  the  error  rate  varies  among  the  inspection 
teams,  the  supervisors  will  know  which  teams  need  more  supervision. 

Third,  the  check  enables  the  individual  inspection  team  to  estimate 
how well its inspections are being performed. This self-evaluation is
possible on both absolute (number of inconsistencies) and relative
(inconsistencies compared to those of other teams) scales. If the checked results
can  be  returned  to  the  inspector  quickly,  there  may  be  a  very  favorable 
impact  on  the  morale  of  the  inspectors.  We  have  found  that  the  inspection 
team  is  isolated  in  its  work  and  may  feel  that  the  data  being  collected  are 
not  being  used,  are  useless  or  are  being  lost.  Since  meaningful  analyses 
cannot  be  performed  until  the  data  base  is  complete,  any  interim  results, 
particularly  as  they  relate  to  individual  performance,  can  help  to  alleviate 
feelings  of  isolation  and  futility. 

Fourth, we assume, from necessity, that the frequency of uncheckable
errors and the frequency of checkable errors are related. If the error
rate, as measured by consistency and redundancy errors, is low, there is
some assurance that errors which cannot be checked are also infrequent.
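
As an illustration of how the questionnaire edit might feed back to the supervisors, the following Python sketch applies three of the checks mentioned above (blank fields, illegal codes, and a missing kitchen line) and averages the checkable errors per questionnaire for each inspection team. The field names (`age`, `XS`, `rooms`, `team`) and both function names are assumptions made for this example; the actual questionnaire layout is the one in Figure 1.

```python
LEGAL_AGE_CODES = {"1", "2", "3"}  # the three legitimate age-of-dwelling codes

def edit_questionnaire(q):
    """Return the list of checkable errors found in one questionnaire.

    q is assumed to be a dict with hypothetical keys:
      'age'   - age-of-dwelling code (string)
      'XS'    - exterior surface material code (string)
      'rooms' - list of room names for which lines were filled in
    """
    errors = []
    # Blank-field check.
    for field in ("age", "XS"):
        if not str(q.get(field, "")).strip():
            errors.append(f"{field}: blank field")
    # Illegal-code check (only meaningful when the field is not blank).
    age = str(q.get("age", "")).strip()
    if age and age not in LEGAL_AGE_CODES:
        errors.append("age: illegal code")
    # Missing-line check: every dwelling unit must have a kitchen.
    if "kitchen" not in q.get("rooms", []):
        errors.append("rooms: missing kitchen line")
    return errors

def error_rate_by_team(questionnaires):
    """Average number of checkable errors per questionnaire, by team."""
    totals, counts = {}, {}
    for q in questionnaires:
        team = q["team"]
        totals[team] = totals.get(team, 0) + len(edit_questionnaire(q))
        counts[team] = counts.get(team, 0) + 1
    return {team: totals[team] / counts[team] for team in totals}
```

A uniformly high rate across teams would point toward training or questionnaire wording; a rate that is high for one team only would point toward that team.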

It  is  important  to  return  the  results  of  the  questionnaire  edit  to 
the  supervisors  (and  inspectors)  very  quickly.  Many  errors  are  such  that 
no  additional  field  work  is  required  to  make  the  appropriate  correction  if 
the  error  report  is  received  quickly  enough.  For  example,  an  inspector 
would  probably  remember  that  the  house  at  123  South  Main  Street,  which  he 
inspected the previous day, was a frame unit built prior to 1940. But
he  might  be  unable  to  recall  the  age  and  construction  type  of  the  unit 
at  123  North  Main  Street  which  was  inspected  three  weeks  ago  last 
Tuesday. 

To  permit  interim  processing  (i.e.,  exploratory  analyses  prior  to 
the  completion  of  data  collection) ,  every  field  within  a  record  must  be 
recognized as legitimate by the analysis programs. This can be ensured
by using "default" values for variables or characters which have illegal
or  inconsistent  values.  Each  default  inserted  should  trigger  a  warning 
"message"  that  the  default  value  has  been  generated.  There  are  several 
general  classes  of  default  values;  each  of  them  requires  somewhat  different 
treatment  in  the  edit  process  and  the  remedial  procedures. 

The  first  class  is  that  for  which  no  "true"  or  "most  likely"  value 
can  be  substituted.  These  include  blank  fields  and  illegal  values. 
Examples  are: 

1) Non-numeric characters where XRF readings are expected.

2) Missing or illegal codes for age of unit or exterior surface.

These values are to be replaced by a default value to be interpreted
as "unknown".
A second class is that for which a true or most likely value can be
deduced.  Examples  are: 

1)  Substitution  of  the  "material"  part  of  the  "condition"  code  from 
the  exterior  description  portion  of  the  questionnaire  for  the 
field  XS. 

2)  Substitution  of  the  code  denoting  a  "good"  surface  for  a  blank 
"surface"  code. 

3) Substitution of 0, 1, or 5 for the letters O, I, or S in positions where
numeric  values  are  required. 

These  values  which  have  been  defaulted  are  subject  to  subsequent 
update  or  correction.  The  correction  may  be  supplied  by  the  data  base 
manager,  the  supervisor,  or  the  inspector.   Even  if  no  update  is  ever 
made,  some  useful  data  are  retained.  One  error  or  inconsistency  does  not 
mean  that  the  questionnaire  is  useless.  This  can  be  illustrated  by  example: 
the  age  of  dwelling  category  has  three  legitimate  values:  1,  2,  and  3. 
The  consistency  edit  should  accept  a  1,  2,  or  3;  default  an  I  to  a  1;  and 
default  any  other  value  to  a  4  which  is  then  interpreted  as  "unknown" 
for  subsequent  processing.  The  warning  message  indicates  that  the  default 
has  occurred;  first  the  data  base  manager,  then  the  supervisor  and  finally 
the  inspectors,  will  try  to  produce  the  proper  correction,  if  required, 
for  the  defaulted  value.  Even  if  the  correction  is  never  made,  the  data 
element  retains  most  of  its  value  for  the  analyses.  For  any  analysis 
which  does  not  depend  on  age  of  dwelling,  this  element  is  as  valid  as  any 
other  in  the  data  base,  whereas  for  an  analysis  which  is  dependent  on 
age,  the  sample  is  smaller  by  one  unit. 
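
The age-of-dwelling rule just described translates almost directly into code. The sketch below is a minimal Python rendering of that rule; the function name `edit_age_code` and the wording of the warning messages are assumptions for illustration only.

```python
UNKNOWN_AGE = 4  # default value, interpreted as "unknown" in later processing

def edit_age_code(raw):
    """Return (code, warning) for a raw age-of-dwelling entry.

    Legitimate values 1, 2, and 3 pass through with no warning.
    The letter I is taken as a mis-keyed 1. Anything else defaults
    to 4 ("unknown"). Every default produces a warning message.
    """
    value = str(raw).strip()
    if value in ("1", "2", "3"):
        return int(value), None
    if value == "I":
        return 1, f"age code {raw!r} defaulted to 1"
    return UNKNOWN_AGE, f"age code {raw!r} defaulted to {UNKNOWN_AGE} (unknown)"
```

Because the function always returns a code the analysis programs can recognize, interim processing can proceed while the warning is routed to the data base manager for possible correction.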

Exactly how far to pursue the attempts to correct defaulted values
in the data base is an extremely difficult question. Considering
that two to three man hours are required to collect data for a single
unit, a considerable effort is justified to salvage as many units as
possible. However, reinspection or partial reinspection is not worthwhile,
both because of the costs involved and the anticipated reaction of
the occupant. Certainly those corrections which can be made without
gathering additional information (those dependent upon the original
questionnaire, the supervisor's record, and the inspector's memory) should be
made. The corrections which require only observation of the exterior
by the inspector or supervisor or minimal participation by the occupant
(doorway interview or telephone interview) should be made unless considerable
cost or time is involved. Those corrections which require reinspection
should not be made; the default values should remain in the data
base.

ESTIMATING  THE  INCIDENCE  OF  LEAD  PAINT 
ON  SURFACES 

Once  the  data  have  been  collected  and  edited,  the  determination  of 
the  fraction  of  hazardous  units  in  the  sample  is  in  principle  a  simple 
counting  problem.  The  interpretation  of  the  meaning  of  this  sample 
fraction  is  done  in  much  the  same  way  as  the  estimation  of  the  required 
sample size - by use of tables such as Tables 1 and 2 in Part I or the
formulae in Appendix B.

Depending  on  the  measure  of  hazard  used  to  characterize  a  dwelling 
unit,  the  conceptually  simple  task  of  counting  the  hazardous  units  may 
be  arduous  and  time  consuming.  If  the  average  lead  content  of  a  number 
of  surfaces  is  used,  a  considerable  amount  of  arithmetic  is  required  simply 
to calculate the measure for each dwelling unit; if the median lead reading
is to be used, even more work is required to compute the measure.
Typical measures which could be used for any set of surfaces are:

1 - Highest lead readings

2 - Average lead readings

3 - Median lead readings or some other percentile

4 - Fraction of surfaces having lead readings which exceed some
threshold (this is an exposure index of sorts).

The measure problem is further complicated by the fact that there
are many sets of surface populations: walls, doors, windows, exterior
surfaces, wet rooms, dry rooms, combinations of surface and substrate
conditions as well as demographic characteristics of the occupants.
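
When the readings are held in a computerized data base, the four typical measures listed above take only a few lines to compute. The Python sketch below is an illustration, not a prescribed procedure: the function name `hazard_measures` is hypothetical, the readings are assumed to be in whatever units the XRF instrument reports, and the threshold is whatever value the survey's policy decision supplies.

```python
from statistics import mean, median

def hazard_measures(readings, threshold):
    """Summarize the XRF lead readings for one set of surfaces.

    readings:  list of numeric readings for the chosen set of surfaces
               (for example, all walls in one dwelling unit).
    threshold: the reading above which a surface is counted as exceeding
               the hazard level; its value is a policy choice.
    """
    if not readings:
        raise ValueError("no readings for this set of surfaces")
    return {
        "highest": max(readings),
        "average": mean(readings),
        "median": median(readings),
        # Fraction of surfaces whose reading exceeds the threshold.
        "fraction_above": sum(r > threshold for r in readings) / len(readings),
    }
```

Computing all four at once makes it easy to see how closely the candidate measures agree for a given dwelling unit before one of them is adopted.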

If  the  survey  has  a  code  enforcement  orientation,  the  measure  (or 
measures) to be considered are those which are specified in the particular
code. These code requirements may depend on surface type, surface
condition, lead content, and occupancy characteristics but are relatively
straightforward and can be fixed as a part of the establishment of objectives
of the survey.

For  a  health  oriented  survey,  the  criteria  for  the  selection  of 
appropriate  measures  are  not  nearly  so  simple.  There  is  no  generally 
accepted  medical  definition  of  a  lead  hazard  in  terms  of  condition,  lead 
content,  and  accessibility.  There  is  neither  dose-response  information 
nor  an  understanding  of  the  mechanisms  by  which  lead  is  transported  from 
lead  painted  surfaces  to  the  child.  Fortunately,  in  practice,  the 
reasonable  candidates  for  appropriate  measures  are  related;  our  experience 
in  previous  surveys  has  been  that  these  measures  are  well  correlated.  In 
the absence of any firm technical definition of what constitutes a
hazard,  two  choices  -  what  measure  or  variable  to  use,  and  the  critical 
or  threshold  value  of  that  variable  -  are  policy  decisions. 
APPENDIX A
DETERMINATION  OF  SAMPLE  SIZE 
A  door  to  door,  unit  by  unit  census  of  all  dwelling  units  1/  in  a 
city  to  determine  the  lead  paint  hazard  is  an  extremely  costly  and  time 
consuming  process  which  most  jurisdictions  will  want  to  avoid.  Random 
sampling  is  faster  and  more  economical  and  can  yield  very  accurate  results. 

Random  sampling  involves  choosing  a  fraction  of  all  the  dwelling 
units  in  such  a  manner  that  each  unit  has  an  equal  chance  of  being  selected. 
By  applying  the  mathematics  of  probability  to  the  sample  findings,  the 
characteristics  of  all  the  dwellings  (lead  paint  content  in  this  case) 
can  be  estimated. 

You,  the  planners  and  administrators  of  the  lead  paint  survey, 
should  have  an  understanding  of  the  sampling  process  even  if  you  plan 
to  have  a  statistician  handle  this  portion  of  the  survey. 

There  is  a  statistical  formula  which  gives  the  appropriate  sample 
size  (n)  for  any  desired  confidence  level.  This  formula  involves  the 
following  quantities: 

n = number of units in the sample
N = number of units in the total population
P = the fraction of the total population which is hazardous.
    P, of course, is not known; we are performing the survey to determine
    P. However, we can use the best current estimator of P, which we
    call p.
Q = the fraction of the total population which is not hazardous.
    Note that Q = 1 - P, and that just as p is used as
    an estimate of P, so q = 1 - p is used as an estimate of Q.
E = the maximum error as a fraction of N which can be tolerated
    in the estimate of P.
k = a number related to the required confidence level (e.g., 90%,
    95%, etc.). The appropriate value of k is obtained from a table
    such as Table 4 (page 58).

1/ "Dwelling Unit" is defined by the U.S. Bureau of the Census as a group of
rooms or a single room occupied or intended for occupancy as "separate living
quarters" by a family or other group of persons living together or by a person
living alone.

All of these quantities except P and n are known, but we have an
estimator, p, which may be substituted into the theoretically correct formula

$$ n = \frac{k^2 N P Q}{k^2 P Q + (N-1) E^2} $$

to yield

$$ n = \frac{k^2 N p q}{k^2 p q + (N-1) E^2} $$

which we use to calculate n.

If p is a poor estimator of P, we have a poor approximation to n, but the
value of n is relatively insensitive to changes in p. A change in p produces
a smaller change in n. Table 3 illustrates this: consider N = 100,000;
if p = .2, n = 256; if p = .4 (changed by a factor of 2), n = 383 (changed
by only a factor of 1.5). Our actual procedure in calculating n is to use
the best available data and judgment to establish a preliminary estimate for
p, then solve for n. As the data are collected we periodically recalculate
n using the fraction of the inspected units which are hazardous as an estimate
for p. This is a trial and error method with the important property that each
trial produces a better estimate for p than its predecessor.

If there is no reasonable initial estimate of p, an upper bound on the
size of n can be established by setting p = .5. Table 3 illustrates that
even if this estimate is off by a factor of 5, the calculated n is well within
a factor of 2 of the true n.
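
For readers who prefer to compute n directly rather than read it from a table, the sample size formula can be sketched in Python as follows. The function names are hypothetical, and rounding up to the next whole unit is an assumption of this sketch (it reproduces the figures quoted in this appendix).

```python
import math

def sample_size(N, p, E, k):
    """Sample size n = k^2 N p q / (k^2 p q + (N - 1) E^2), with q = 1 - p.

    N: number of units in the total population
    p: current estimate of the hazardous fraction
    E: maximum tolerable error in the estimate, as a fraction
    k: multiplier for the required confidence level
    """
    q = 1.0 - p
    return (k**2 * N * p * q) / (k**2 * p * q + (N - 1) * E**2)

def units_to_inspect(N, p, E, k):
    """Round the computed sample size up to a whole number of dwelling units."""
    return math.ceil(sample_size(N, p, E, k))
```

Assuming k = 2 and E = 0.05, `units_to_inspect(100000, 0.2, 0.05, 2)` gives 256 and `units_to_inspect(100000, 0.4, 0.05, 2)` gives 383, the two values cited from Table 3 above. The periodic recalculation described in the text amounts to calling the same function again with p replaced by the hazardous fraction observed so far.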

Before  proceeding  to  examples,  we  will  explain  the  quantity  k  which 
appears in the sample size formula. Both experience and theoretical
considerations lead to the conclusion that the probabilities of errors of
different sizes, in a great variety of situations, are accurately
representable by the bell-shaped or "normal" curve shown in Figure 1.
The total area under the curve is taken to be 100% of all possible
errors, both with respect to error size and error frequency. Errors of
greater and greater size, whether positive or negative, are less and less
probable, corresponding to the curve's falling away steadily from the
peak. Normal probability curves differ in the rapidity of this falling
away; associated with each curve is a number σ (the "standard deviation"),
which is a measure of the steepness of the decline. The probability of
an error greater than σ in magnitude is roughly 31.74% (this is 2 x 13.59%
+ 2 x 2.14% + 2 x 0.14% as illustrated in Figure 1); the probability of
an error greater than 2σ is about 4.54%; and an error greater than 3σ
will occur less than 0.28% of the time. The coefficients 1, 2, and 3 in
1σ, 2σ, and 3σ therefore correspond to respective levels of confidence
of 68.26% (2 x 34.13%), 95.46% (2 x 34.13% + 2 x 13.59%), and 99.72%
that the error will not exceed the stipulated size. Turning the situation
around, we can fix a desired level of confidence, say 90%, and from the
normal curve or a table such as Table 4, find the particular k such that
the probability that the error will not exceed kσ is just 90%.
Specifically, for 75%, k = 1.150; for 90%, k = 1.645; for 95%, k = 1.960;
and for 99%, k = 2.576.
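
Where a table of the normal distribution is not at hand, k can be computed. The sketch below uses the `NormalDist` class from Python's standard `statistics` module (available in Python 3.8 and later); the two function names are hypothetical and chosen only for this example.

```python
from statistics import NormalDist

def k_for_confidence(confidence):
    """Return k such that a fraction `confidence` of the normal
    distribution lies within k standard deviations of the mean.

    confidence is given as a fraction, e.g. 0.90 for 90%.
    """
    if not 0.0 < confidence < 1.0:
        raise ValueError("confidence must be strictly between 0 and 1")
    # The excluded area (1 - confidence) is split equally between the
    # two tails, so k is the quantile at (1 + confidence) / 2.
    return NormalDist().inv_cdf((1.0 + confidence) / 2.0)

def confidence_for_k(k):
    """Fraction of the normal distribution lying within k standard deviations."""
    return 2.0 * NormalDist().cdf(k) - 1.0
```

Rounded to three decimal places, `k_for_confidence` returns 1.150, 1.645, 1.960, and 2.576 for confidence levels of 0.75, 0.90, 0.95, and 0.99, in agreement with the values given above.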

In our situation, the "error" which is distributed according to the
normal curve is the discrepancy between P, the fraction of hazardous
units of the total population, and p, the fraction of hazardous units
in the sample.

Table 4
Percentage of Normal Distribution Lying Within Multiples of σ
(C% are within kσ of the mean)

 k      C(%)

1.0     68.3
1.1     72.9
1.2     77.0
1.3     80.6
1.4     83.8
1.5     86.6
1.6     89.0
1.7     91.1
1.8     92.8
1.9     94.3
2.0     95.4
2.1     96.4
2.2     97.2
2.3     97.9
2.4     98.4
2.5     98.8
2.6     99.1
2.7     99.3
2.8     99.5
2.9     99.6
3.0     99.7
3.1     99.8
3.2     99.9

Figure 1
Normal probability curve, with areas of 13.59%, 2.14%, and 0.14% marked
on each side of the mean beyond 1σ.

Example  1: 

To determine the size of the sample in a population of 25,000 dwelling
units where it is desired to state with 95 percent confidence (2σ)
that you are within 5 percent of the total number of hazardous units,
you must first make a preliminary estimate (p) of the fraction of units
that are hazardous. For this example we will assume that half the units
are hazardous (p = 0.5) and half are not hazardous (q = 0.5), so that we
will get the largest possible value of pq (.25), as is shown below:


 P       Q       PQ

0.1     0.9     .09
0.2     0.8     .16
0.3     0.7     .21
0.4     0.6     .24
0.5     0.5     .25
0.6     0.4     .24
0.7     0.3     .21
0.8     0.2     .16
0.9     0.1     .09

The higher the value of PQ, the larger the sample required and the more
expensive the survey. Using the equation we have already introduced,

$$ n = \frac{k^2 N P Q}{k^2 P Q + (N-1) E^2} $$

and substituting,

$$ n = \frac{(2)^2 (25{,}000)(0.5)(0.5)}{(2)^2 (0.5)(0.5) + (24{,}999)(0.05)^2} \approx 393.7 $$

Thus, 394 dwelling units should be included in the sample.
Example  2: 

Table  3  (page  57)  illustrates,  still  using  95%  confidence  and  a  10% 
interval  width,  the  effect  on  sample  size  of  changes  in  P  and  N.  Note 
that  while  n  varies  with  both  P  and  N,  the  variation  with  P  is  far  greater. 
Note  also  that  for  a  given  P,  once  N  is  as  large  as  10,000,  a  larger  N 
affects n by at most 4%.
Example  3: 

For this example a difference not exceeding 10 percent is required
with 99.7 percent confidence (3σ), with N = 25,000 and P = 0.5.

$$ n = \frac{(3)^2 (25{,}000)(0.5)(0.5)}{(3)^2 (0.5)(0.5) + (24{,}999)(0.10)^2} \approx 223 $$

The  larger  the  acceptable  difference,  the  smaller  the  sample  size  needed 
for  the  desired  confidence.  From  these  examples  we  see  how  important 
our  estimates  are  in  choosing  an  appropriate  sample  size.  The  sample 
size  is  heavily  dependent  on  the  fraction  of  hazardous  dwellings.  The 
pretest  can  be  used  to  validate  your  original  estimates  so  you  can  be 
assured  that  your  sample  size  is  acceptable. 

Suppose  that  after  sufficient  data  have  been  collected  and  analyzed, 
the  value  of  p  is  found  to  be  0.2,  rather  than  the  original  estimate  (P) 
of 0.5. Since 0.5 gives the largest possible value of PQ and the largest
sample,  you  have  oversampled.  This  error,  though  costly,  will  increase 
the  level  of  confidence  in  your  findings.  Your  statistician  can  help 
in  determining  this. 
Example  4: 

Just  to  see  how  the  value  of  P  can  affect  the  sample  size,  let's 
recalculate  Example  1  using  0.2  instead  of  0.5  as  the  value  of  P. 

$$ n = \frac{(2)^2 (25{,}000)(0.2)(0.8)}{(2)^2 (0.2)(0.8) + (24{,}999)(0.05)^2} \approx 253.4 $$

which rounds up to 254 units. This is 140 fewer units than would have been
necessary had P been 0.5 (see Example 1).

If  the  opposite  situation  occurs  and  the  sample  size  is  too  small 
for  the  specified  degree  of  confidence,  additional  sampling  will  be 
necessary. 

STRATIFICATION  FOR  SAMPLING 

The construction, type of usage (renter or owner occupied), and size
of housing units reflect the methods of design and construction used for
them as well as the probable type of maintenance received over the years.
All these are clues to the probability of presence of a lead paint hazard.

We  may  wish  to  treat  each  subpopulation  of  interest  as  a  total 
population  in  itself  and  make  separate  estimates  of  p  for  each  of  the 
subpopulations  or  cells.  This  approach  requires  that  we  have  a  sample 
size  from  each  such  cell.  It  may  not  be  possible  to  select  samples 
from  stratified  lists  (i.e.,  separate  lists  for  each  cell),  however, 
since  this  information  is  not  always  available.  The  samples  may  be 
drawn  from  an  aggregated  list  by  use  of  the  procedure  to  be  described. 
The following is a table of cells typical of a lead hazard survey.

                                            Year Built
Category                        Pre-1939    1940-1959    Post-1960

A. Owner Occupied Structures
B. 1-4 Unit Structures
C. 5-19 Unit Structures
D. Over 20 Unit Structures


Using  the  values  from  Example  1,  we  can  demonstrate  the  process 
required  if  stratified  lists  are  not  available. 

N  =  25,000 

k  =  2 

E  =  .05 

P  =  as  listed  below  for  each  cell 

Assume that from some other source, the population N can be disaggregated
as shown.

                                        Dwelling Units
Category                  Built Prior to 1939    Built 1940 to Present

Single Family Detached    Cell I:   10,000       Cell II:  5,000
Multifamily               Cell III:  4,000       Cell IV:  6,000

Suppose that for Cell I we have no idea of the value of P. Therefore,
we assume the worst possible case (i.e., corresponding to the largest
possible sample size) with P = 0.5.

For  Cell  II  our  best  estimate  is  P  =  0.2;  for  Cell  III,  P  =  0.3; 
and  for  Cell  IV,  P  =  0.4. 

Applying the sample size formula gives the following sample sizes:

                                        Dwelling Units
Category                  Built Prior to 1939    Built 1940 to Present

Single Family Detached    Cell I:   385          Cell II:  237
Multifamily               Cell III: 310          Cell IV:  361
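
The cell-by-cell calculation can be sketched as follows. This is an illustration only: it assumes the formula in the form printed in Example 4 with k = 2 and E = .05, rounds up, and uses names of our own choosing. Evaluated this way, Cells I, III, and IV reproduce the tabulated 385, 310, and 361; Cell II comes out at 244 rather than the printed 237, a difference this sketch does not try to explain.

```python
import math

def sample_size(N, P, k=2.0, E=0.05):
    """Sample size for one cell treated as a population in itself (rounded up)."""
    Q = 1.0 - P
    return math.ceil((k ** 2 * N * P * Q) / (k ** 2 * P * Q + N * E ** 2))

# (cell population, assumed P) taken from the disaggregated table in the text
cells = {
    "I": (10_000, 0.5),
    "II": (5_000, 0.2),
    "III": (4_000, 0.3),
    "IV": (6_000, 0.4),
}

sizes = {name: sample_size(N, P) for name, (N, P) in cells.items()}
for name, n in sizes.items():
    print(f"Cell {name}: {n}")
```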

We must (assuming our estimates for the respective p's are not
modified) eventually inspect 1306 dwelling units; these must include 385
in Cell I, 237 in Cell II, 310 in Cell III, and 361 in Cell IV. We
begin by drawing a sample, from an aggregated list, of some size less
than 1306, say 1000 dwelling units. Assume that after the 1000 units
are inspected, we find that their distribution among the cells, and the
p associated with the cells, are as follows:

                                        Dwelling Units
Category                  Built Prior to 1939      Built 1940 to Present

Single Family Detached    Cell I:   322; p = .2    Cell II:  200; p = .2
Multifamily               Cell III: 226; p = .3    Cell IV:  252; p = .25


We  discover  that  our  estimate  for  P  in  Cell  I  was  rather  poor,  so 
we  recalculate  the  sample  size  using  p  as  the  estimate  for  P.  The  n  thus 
calculated  is  250,  which  is  less  than  the  322  we  have  inspected.  Cell  I 
is  completed  (and  we've  actually  oversampled)  even  though  the  322  units 
are  fewer  than  we  had  originally  estimated. 

For Cell II, our estimate for P was quite good; the estimated
sample size required is unchanged at 237 units. Our subsequent samples
must include 37 more units in Cell II. Similarly, our estimate of P for
Cell III was good, and Cell III requires an additional 84 units.

Using the p for Cell IV we recalculate n and get a new value of
301; thus we need 49 more units in Cell IV.

We now draw a second sample in an attempt to fill one or more of
the unfilled cells. In the second sample all units falling in Cell I
will be discarded, but all other units in the sample must be included,
because of the equally likely principle, to avoid a bias. We must
therefore determine a "good" size for the second sample in order to
avoid unnecessary costs which would be incurred by overshooting the
required sample size.

Based on the original total population we expect 20% of a simple
random sample from our list to belong to Cell II, 16% to belong to Cell III
and 24% to belong to Cell IV. In order to produce an efficient sample
size, we form the ratios of the supplementary requirements for each
cell to the fraction of an unstratified sample expected to fall into
that cell. These are:

for Cell II, $\frac{37}{.20} = 185$,

for Cell III, $\frac{84}{.16} = 525$, and

for Cell IV, $\frac{49}{.24} = 205$.

Using the minimum from among these ratios*, 185, we add a small increment,
say 15, to avoid an undersample. We expect that a second sample of 200
units will fill Cell II; it will also add about 35 units to Cell III
and about 45 units to Cell IV.

The  procedure  used  for  the  second  sample  may  be  iterated  until  all 
cells  are  completed.  Each  iteration  should  complete  at  least  one 
unfilled  cell. 
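
The sizing of a supplementary sample can be sketched as below. This is a minimal illustration under stated assumptions: the required and inspected counts are simply those quoted in the text, ratios are rounded up, the cushion of 15 is the increment suggested above, and the function and variable names are hypothetical.

```python
def second_sample_size(shortfalls, cell_pops, total_pop, cushion=15):
    """Smallest ratio of shortfall to expected cell share, plus a cushion."""
    ratios = {
        cell: -(-need * total_pop // cell_pops[cell])  # ceiling division
        for cell, need in shortfalls.items()
        if need > 0  # completed cells play no part
    }
    return min(ratios.values()) + cushion, ratios

required = {"I": 250, "II": 237, "III": 310, "IV": 301}
inspected = {"I": 322, "II": 200, "III": 226, "IV": 252}
cell_pops = {"I": 10_000, "II": 5_000, "III": 4_000, "IV": 6_000}
total_pop = 25_000

shortfalls = {c: max(0, required[c] - inspected[c]) for c in required}
size, ratios = second_sample_size(shortfalls, cell_pops, total_pop)

# Units each unfilled cell can expect to gain from the second sample
expected = {c: size * cell_pops[c] // total_pop for c in ratios}
print(shortfalls, ratios, size, expected)
```

Repeating the same calculation with the updated counts after each draw gives the iteration described above.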

*If the cell population estimates are suspect, the denominators of these
ratios may be replaced by the fractions of the first sample which belonged
to the cell.

Some "ground rules" are required to accommodate the idiosyncrasies
of a city directory. These ground rules are:

1. Any commercial or industrial address selected is discarded.

2. Any blank (non-address) line selected is discarded.

3. Any line selected which is not the first line of an address is
discarded.

Several methods for the drawing of a sample are illustrated by
example. There are advantages and disadvantages to each; the method to
select depends on the objectives of the survey, characteristics of the
population list, the equipment available, etc.

For  each  of  the  methods  we  will  assume  a  city  of  600,000  dwelling 
units,  with  5,000  units  to  be  selected  for  a  sample.  Our  city  directory 
has  1450  pages  with  five  columns  per  page  and  100  lines  per  column.  Note 
that  there  are  250  pages  more  than  are  required  to  contain  the  dwellings; 
these  are  due  to  commercial  addresses;  multi  line  listings;  interspersed 
blank  lines  and  short  pages. 

Method 1

Step 1 - Calculate the total number of entries, n', to be drawn from the
directory, such that 5000 (or n) of them could be expected to be dwelling units.

$$ \frac{5000}{n'} = \frac{1200 \text{ (pages to contain 600,000 units)}}{1450 \text{ (total pages)}} $$

$$ n' = \frac{1450}{1200} \times 5000 = 6042 $$

Step 2 - Divide the total number of entries in the directory (pages
x columns x lines) by n' to get I.

$$ I = \frac{1450 \times 5 \times 100}{6042} = 120 $$

Step 3 - Using a random number table¹, choose a three digit (the
same number of digits as are in I) sequence of random digits. If this
three digit number is greater than I, discard it and choose another
three digit sequence until one is found which is 120 or less. This
number becomes J. For example, if our random number table is:

314    159    102    271

314 is greater than 120

159 is greater than 120

102 is less than 120 and becomes J.

Step 4 - Our sample is now defined as the Jth entry in the list
plus the (J + I)th entry, the (J + 2I)th, etc., up to the (J + (n' - 1)I)th
entry. Now we must simply translate both I and J from index numbers to
their equivalent in page, column and line. I is 120, corresponding to no
pages, one column, and 20 lines. J is 102, or no pages, one column, and 2 lines.
The first few units of our sample are:

             Location In Directory
Index      Page      Column      Line

102        1         2           2
222        1         3           22
342        1         4           42
462        1         5           62
582        2         1           82
702        2         3           2
822        2         4           22

¹ For example, the Rand Corporation, A Million Random Digits with 100,000
Normal Deviates, The Free Press, New York; Collier-Macmillan Limited, London.

This method has the advantage of being conceptually simple; the
drawing of a single random number fixes the entire sample. Other
advantages are that the units selected are in directory order (and thus
approximately geographical, which implies that any sequence forms a
reasonable inspector itinerary), and that there will be no duplicates.
It has two disadvantages: first, while the entire sample is random, its
roughly geographical order implies sequences of consecutive entries will
not in general produce an unbiased sample; second, the arithmetic for
getting successive entries may be messy since non-decimal "carries" are
involved in stepping from one unit to the next. The method requires
substantial personnel time for large samples.
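
The "messy" carries of Method 1 disappear if the index is converted to page, column and line mechanically. The sketch below is illustrative only: it assumes the example directory (1450 pages, 5 columns, 100 lines), rounds the entry count up and the interval to the nearest whole number, and stops at the end of the directory; all names are ours.

```python
import math

PAGES, COLUMNS, LINES = 1450, 5, 100
TOTAL_ENTRIES = PAGES * COLUMNS * LINES

def locate(index):
    """Convert a 1-based directory index to 1-based (page, column, line)."""
    zero_based = index - 1
    page, rest = divmod(zero_based, COLUMNS * LINES)
    column, line = divmod(rest, LINES)
    return page + 1, column + 1, line + 1

def systematic_indices(start, interval, total=TOTAL_ENTRIES):
    """Every interval-th index from start, not running past the directory."""
    return range(start, total + 1, interval)

entries_to_draw = math.ceil(5000 * PAGES / 1200)   # Step 1
interval = round(TOTAL_ENTRIES / entries_to_draw)  # Step 2 (I)
start = 102                                        # Step 3 (J), from the table

for index in systematic_indices(start, interval)[:5]:
    print(index, locate(index))
```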

Method  2  -  For  this  procedure,  an  m  digit  sequence  of  numbers  in  which 
each  digit  is  0,  is  interpreted  as  a  1  followed  by  m  zeroes.  For  the  cases 
in  which  we  are  interested  in  our  example,  0000  becomes  10000;  0  becomes 
10;  and  00  becomes  100. 

Step 1 - Determine the largest multiple of the number of pages which
is less than or equal to the largest number which can be expressed in the
same number of digits contained in the number of pages. In this case, the
number of digits in the number of pages is four; four zeroes (0000) is taken
to represent 10000; 1450 is contained in 10000 six times.

Step 2 - Determine the largest multiple of the number of columns which
can be expressed in a single digit. In this case, 2 x 5 = 10; thus 2 is
the largest multiple.

Step 3 - Determine the largest multiple of the number of lines per column
(100) which can be contained in a two digit number. In this case the answer
is 1 (remember that 00 is interpreted as 100).

Step 4 (to be repeated 6042 times)

a. Choose a four digit random number to select the page. If this
number is greater than 8700 (6 x 1450), discard it and select another number.
If it is less than or equal to 1450, it becomes the page index; if it is
greater than 1450, divide the number by 1450 and use the remainder as
the page index.

b. Choose a random digit to determine the column index. In this
case, since 10 is an exact multiple of 5, a random digit of 1 or 6 selects
column 1; 2 or 7, column 2; 3 or 8, column 3; 4 or 9, column 4; and 5
or 10 (a digit of 0), column 5.

c. Choose a two digit random number to select the line. In this
case, the random number is the line.

d. Combine the three numbers thus selected; now we have a page,
column, and line which fix a directory address.

For example, if our sequence of random digits is:

912    314    159    265    427    182    849    045

For the first entry

4a   9123 is larger than 8700; we discard it.
     1415 is less than 8700 and less than 1450; it becomes our page
     number.

4b   9 denotes column 4

4c   26 denotes line 26

4d   our first entry is page 1415, column 4, line 26.

For the second entry

4a   5427 is less than 8700 but greater than 1450, so we divide by 1450:
     5427 = 3 x 1450 + 1077 (that is, 5427 - 4350 = 1077);
     1077 becomes the page number

4b   1 denotes column 1

4c   82 denotes line 82

4d   our second entry is page 1077, column 1, line 82.
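
Method 2 amounts to rejecting random numbers above the largest usable multiple and reducing the rest by remainder. The following sketch illustrates that reading; it is not the Manual's own procedure listing. It assumes the digit widths used in the example (four, one and two digits), treats an all-zero group as 1 followed by zeroes, and maps a remainder of zero to the last page, column or line. The names `draw` and `draw_entry` are hypothetical.

```python
from itertools import islice

def draw(digits, width, count):
    """Take `width` digits at a time until a usable value appears.

    Returns a 1-based index in 1..count.
    """
    limit = (10 ** width // count) * count  # largest usable multiple
    while True:
        chunk = "".join(islice(digits, width))
        if len(chunk) < width:
            raise ValueError("ran out of random digits")
        value = int(chunk) or 10 ** width   # 0000 reads as 10000, etc.
        if value <= limit:
            return (value - 1) % count + 1

def draw_entry(digits, pages=1450, columns=5, lines=100):
    page = draw(digits, 4, pages)      # Step 4a
    column = draw(digits, 1, columns)  # Step 4b
    line = draw(digits, 2, lines)      # Step 4c
    return page, column, line          # Step 4d

stream = iter("912314159265427182849045")  # digits from the example
first = draw_entry(stream)
second = draw_entry(stream)
print(first, second)
```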
Method 3

This procedure is based on a list of random numbers R1, R2, R3, etc.,
each of which is greater than or equal to zero but less than one.

Step 1 - our page number is the largest integer contained in
R1 x number of pages + 1

Step 2 - our column number is the largest integer contained in
R2 x number of columns + 1

Step 3 - our line number is the largest integer contained in
R3 x number of lines + 1

This process is repeated using R4, R5, and R6 for the second entry;
R7, R8, and R9 for the third, etc., until the entire sample has been
constructed.

For example, if our list of R's is:

.14132
.91286
.01030
.31415
.92654
.27182
.84904

our first entry is

.14132 x 1450 + 1 = 204.9140 + 1 = 205.9140 or 205

.91286 x 5 + 1 = 4.56430 + 1 = 5.56430 or 5

.01030 x 100 + 1 = 1.030 + 1 = 2.030 or 2

or 205-5-2;

our second entry is

.31415 x 1450 + 1 = 456

.92654 x 5 + 1 = 5

.27182 x 100 + 1 = 28

or 456-5-28.
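
The three steps of Method 3 reduce to one line of arithmetic per index. The sketch below is a minimal illustration using the R values of the example; the function name is ours, and the directory dimensions are those assumed throughout this section.

```python
import math

def entry_from_uniforms(r_page, r_column, r_line,
                        pages=1450, columns=5, lines=100):
    """Largest integer contained in R x count + 1, for each index."""
    return (
        math.floor(r_page * pages) + 1,
        math.floor(r_column * columns) + 1,
        math.floor(r_line * lines) + 1,
    )

rs = [.14132, .91286, .01030, .31415, .92654, .27182, .84904]

# Consume the list three numbers at a time; a leftover R is simply unused.
entries = [entry_from_uniforms(*rs[i:i + 3]) for i in range(0, len(rs) - 2, 3)]
print(entries)
```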

The advantages of methods 2 and 3 are that the arithmetic is simple
and that any sequence of selected entries forms a random sample. The
disadvantages are that the sample will not be in directory order and that
entries may be duplicated; these methods also require substantial personnel
time.

Method 4 - This procedure makes use of a computer program to generate
and organize the samples required. Such a program would operate as
follows:

The program generates a number of triples, each triple being a set
of indices each of which is randomly selected, denoting page, column and
line of the directory. Duplicates are eliminated and the first sample
is sorted by order of appearance in the directory and printed. This is
the original sample to be used in the survey. The remainder of the
triples are then sorted by 40-entry blocks into directory order and these
samples are printed.

Each of the samples thus generated is a random sample, as is any
combination of these samples. If any entry from a sample is used, then
the entire sample must be used in order to preserve the validity (absence
of bias) of the procedure.

In using the samples, the entire first sample is used to start a
survey. Some entries, because of directory anomalies such as short pages,
commercial addresses, blank lines, etc., can be identified in advance
as not denoting dwelling units; others will be found, in the field, to
denote non-existent or nonresidential buildings. If the remaining sample
is smaller than that required, one or more of the 40-entry samples must
be added to the survey sample. This process of adding 40-entry samples
to the survey sample may be iterated until the survey sample is of
sufficient size for the desired confidence level.

A program is provided in Appendix D along with specimen inputs and
outputs.
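
For readers without access to the Appendix D listing, the behavior described above can be approximated in a few lines. What follows is a sketch under our own assumptions, not a transcription of the Appendix D program: it uses Python's standard pseudo-random generator, an arbitrary seed, and a default of 8191 triples; the function name and parameters are hypothetical.

```python
import random

def generate_samples(pages, columns, lines, first_size,
                     total=8191, block=40, seed=12345):
    """Return (first_sample, extra_blocks), each in directory order."""
    rng = random.Random(seed)  # fixed seed makes the samples reproducible
    seen = set()
    triples = []
    for _ in range(total):
        triple = (rng.randint(1, pages),
                  rng.randint(1, columns),
                  rng.randint(1, lines))
        if triple not in seen:  # eliminate duplicates
            seen.add(triple)
            triples.append(triple)
    # Sorting (page, column, line) tuples gives directory order.
    first = sorted(triples[:first_size])
    rest = triples[first_size:]
    extras = [sorted(rest[i:i + block]) for i in range(0, len(rest), block)]
    return first, extras

first, extras = generate_samples(1450, 5, 100, first_size=6042)
print(len(first), len(extras), first[:3])
```

Each extra block is drawn in generation order before it is sorted, so using a whole block (never part of one) keeps the combined sample random.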

Advantages of this method are that the creation of the sample is
economical, quick, and easily reproducible. The disadvantages are that
there may be lag time until the program is created (or adapted) for the
available computer configuration, and, of course, access to the computer
is required.

APPENDIX B
ESTIMATING INCIDENCE OF LEAD PAINT

The procedures used to estimate the hazard fractions after the data
collection has been completed are similar to those described in the
preceding section for estimation of sample size. For the present analyses,
the sample size and hazard fraction for the sample are given; the precision
with which the sample fraction p represents the total population fraction
P must be determined. To illustrate we will consider several examples.

First, assume an extremely small total population (N = 400), with a
sample size, n, of 200, and that the number of hazardous units in the sample
is 100. Thus p = .5 and Tables 1 and 2 (in Part I) indicate that: 1) there
is a 95% probability that P lies between 43.1% and 56.9%, i.e., that the actual
number of hazardous units lies between 172 and 228, and 2) there is a 90%
probability that P lies between 44.2% and 55.8%, i.e., that the actual number
of hazardous units lies between 176 and 223. However, since the confidence
ranges in Tables 1 and 2 are based on "large" populations and our N of 400 is
"small", we also consult Table 3 in Appendix A. This shows that there is at
least a 95% probability that P is between 45% and 55% (180 to 220 hazardous
houses). Because a "small" N means that n = 200 is relatively quite "large",
the range for P is reduced by 3.8 percentage points.

As  a  second  example,  we  consider  a  case  in  which  n  does  not  appear 
in  Tables  1  and  2.  Let  N  =  5000,  p  =  .2,  and  n  =  150.  Since  we  have 
no  confidence  ranges  in  these  tables  for  n  =  150,  we  must  use  the  entry 
from  the  table  for  the  largest  sample  size  which  is  less  than  n.  In 
this  case  the  value  is  100  and  we  can  say:  1)  there  is  at  least  a  95% 
probability that P lies between 12.2% and 27.8%, and, 2) there is at
least a 90% probability that P lies between 13.4% and 26.8%.

From  Table  3  we  see  that  the  effect  of  having  a  relatively  small 
N  is  not  enough  to  assure  a  95%  probability  that  P  lies  between  15% 
and  25%. 

We  can,  however,  estimate  the  confidence  interval  for  any  given 
confidence  level  (95%  in  the  example  following)  by  a  direct  calculation 
from  the  sample  size  equation: 

$$ n = \frac{k^2 N P Q}{k^2 P Q + (N-1) E^2} $$

Substituting p for P and q for Q and solving for E yields:

$$ E^2 = \frac{k^2 p q}{n} \cdot \frac{(N-n)}{(N-1)} $$

$$ E^2 = \frac{(1.96)^2 \times .2 \times .8}{150} \cdot \frac{(5000-150)}{(5000-1)} $$

$$ E^2 = .00399 \quad \text{or} \quad E \approx 6.3\% $$

Thus there is a 95% probability that P lies between (20.0 - 6.3)%
and (20.0 + 6.3)%, i.e., between 13.7% and 26.3%. This narrows the
confidence interval from that which was the best attainable from the
table.
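
The direct calculation of E lends itself to a short function. This is a minimal sketch of the expression for E obtained above from the sample size equation; the function name is ours, and k = 1.96 is the 95% multiplier used in the example.

```python
import math

def margin_of_error(N, n, p, k=1.96):
    """E from the sample size equation, with p and q in place of P and Q."""
    q = 1.0 - p
    e_squared = (k ** 2 * p * q / n) * (N - n) / (N - 1)
    return math.sqrt(e_squared)

p = 0.2
E = margin_of_error(N=5000, n=150, p=p)
print(f"E = {E:.3f}; interval = {p - E:.3f} to {p + E:.3f}")
```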

Alternatively, we can calculate the confidence level for any
prescribed interval by solving the sample size equation for k. We give
two examples:

A. For a confidence range of 15 - 25%

$$ k^2 = \frac{E^2 n}{p q} \cdot \frac{(N-1)}{(N-n)} $$

$$ k^2 = \frac{(.05)^2 \times 100}{.8 \times .2} \cdot \frac{(5000-1)}{(5000-100)} $$

$$ k^2 = 1.60 \quad \text{or} \quad k = 1.3 $$

Using Table 4 of Appendix A, we find that k = 1.3 corresponds to
C = 80.6; thus there is a probability of 80.6% that P lies between
15% and 25%.

B. For a confidence range of 10 - 30%

$$ k^2 = \frac{(.1)^2 \times 100}{.8 \times .2} \cdot \frac{(5000-1)}{(5000-100)} $$

$$ k^2 = 6.37 \quad \text{or} \quad k = 2.5 $$

Using Table 4 we find that there is a 98.8% probability that P
lies between 10% and 30%.
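
Examples A and B can be reproduced without the table if we assume that Table 4 of Appendix A tabulates the two-sided area under the standard normal curve; that assumption, and the function name, are ours. The sketch rounds k to one decimal place, as the examples do, before converting it to a confidence level.

```python
import math

def confidence_for_interval(N, n, p, E, round_k=True):
    """Return (k, confidence level) for a prescribed half-width E."""
    q = 1.0 - p
    k = math.sqrt((E ** 2 * n / (p * q)) * (N - 1) / (N - n))
    if round_k:
        k = round(k, 1)  # the examples read the table at one decimal
    level = math.erf(k / math.sqrt(2.0))  # two-sided normal probability
    return k, level

k_a, c_a = confidence_for_interval(5000, 100, 0.2, 0.05)  # Example A
k_b, c_b = confidence_for_interval(5000, 100, 0.2, 0.10)  # Example B
print(f"A: k = {k_a}, C = {100 * c_a:.1f}%")
print(f"B: k = {k_b}, C = {100 * c_b:.1f}%")
```

Passing `round_k=False` uses the unrounded k and gives a somewhat lower level for Example A, which shows how sensitive the result is to the rounding step.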
APPENDIX C
INSTRUCTIONS FOR COMPLETING THE SAMPLE SURVEY FORM

The instructions for completing the data collection form were used in a
survey in which some 4000 dwelling units were inspected. The specifications
as to legitimate codes, conventions, etc. reflect the format of the
instrument itself, the editing and analysis software used, and the objectives
of the survey. They are not presented as doctrine but rather as an example
of instructions which were effective in a particular survey.

General

The  Data  Collection  Form  (NBS-744),  hereafter  abbreviated  as  DCF,  is 
the  sole  reporting  instrument  for  the  housing  portion  of  a  survey.  There 
are  several  different  types  of  data  appearing  on  the  DCF.  Some  of  the 
data  will  have  been  entered  prior  to  the  inspector's  receipt  of  the  DCF; 
others  will  be  transcribed  from  the  XRF  log  book;  the  remainder  will  be 
collected  on-site  from  the  individual  dwelling  unit. 

In  the  description  of  the  data  elements,  the  term  "field"  is  used  to 
denote  a  sequence  of  characters  or  digits  which  are  treated  as  an  entity. 
DU  is  used  as  an  abbreviation  for  dwelling  unit. 

The DCF consists of 19 lines, each pre-numbered in the field CODE. The
field SERIAL NUMBER forms the unique identification of the DCF. Generally,
if the entity defined by a line of the DCF does not exist (a one story DU,
for example, would have no stairway), that line is left completely blank
when the data are collected. The position of the relevant (or non-blank)
characters within the field is immaterial, e.g., AA1 is considered identical
to A1A or 1AA, where A is used to denote the character "blank". However,
the characters must be successive (a field such as 1A2 is meaningless
whereas A12 and 12A are meaningful and equivalent). Decimal points are
ignored in entering XRF readings; a decimal point is assumed in subsequent
processing. No particular convention must be observed to avoid ambiguity
of 1's and I's or 0's and O's.
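
The rule on blanks within a field can be expressed as a small check for use when the forms are edited. The sketch below is only an illustration of the rule as stated; it represents a blank as a space, and the function name is hypothetical rather than part of the editing software used in the survey.

```python
def normalize_field(raw):
    """Strip leading and trailing blanks; reject blanks between characters."""
    core = raw.strip(" ")
    if " " in core:
        raise ValueError(f"non-successive characters in field: {raw!r}")
    return core  # an all-blank field normalizes to the empty string

print(normalize_field("  1"), normalize_field(" 1 "), normalize_field("12 "))
```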

The  right  hand  side  (RHS)  of  the  DCF  contains  a  comprehensive  list 
of  the  codes  required  in  the  body  of  the  form. 

When  the  inspection  is  completed,  lines  1,  2,  3  and  one  or  more  of  the 
lines  from  4  to  9  will  be  entered.  If  the  DU  did  not  exist  (mistake  in  the 
directory),  only  line  1  will  be  entered.  If  the  DU  exists  and  could  not  be 
inspected,  lines  1  and  3  will  be  entered. 

Section I - Identification

Line 01 - This line is not the responsibility of the inspector. All the fields
here except "V" will be filled before he receives the form. The
field "V" will be filled, if necessary, by the supervisor after
the form is returned.

SERIAL NUMBER - the arbitrarily assigned unique identifying number
for the dwelling unit. It must be numeric.

TRACT - the census tract in which the DU lies.

BLOCK - the census block in which the DU lies.

ZIP CODE - the postal ZIP code of the DU mailing address.

V - the visitation code, which is entered after the DCF is returned.
If an inspection has been made, this field is left blank; otherwise
V describes the reason the inspection could not be made. The codes
are listed in I - VISITATION on the RHS. The accompanying list of codes
is self-explanatory; note, however, that 3 and 6 are different.

STREET NAME AND NUMBER - self-explanatory. This field is used only
by the inspector for locating the DU. It will be dropped in the
subsequent construction of the data base.

Line 02 - These data are collected daily and copied onto the DCF from the XRF
Log Book. They relate to the day of inspection rather than the
specific DU.

XRF SERIAL - the three character numeric serial number of the XRF
instrument used.

TEST BLOCK - the XRF reading obtained from the painted panel test
block included with each instrument. This is taken after the
instrument is calibrated.

ZERO READING - the reading obtained from the non-leaded wood block
included with each instrument. This is taken as the instrument is
calibrated.

ZERO VERNIER - the zero vernier (dial) setting corresponding to the
zero reading.

CALIBRATE READING - the reading obtained from the lead-foil-on-wood
block included with the instrument used in this survey. This is taken
as the instrument is calibrated.

CALIBRATE VERNIER - the vernier setting corresponding to the calibrate
reading.

PEAK READING - the reading obtained from the lead-foil-on-wood block
included with each instrument. This is taken as the instrument is
calibrated.

PEAK VERNIER - the vernier setting corresponding to the peak reading.

INSPECTORS - the initials of each of the members of the inspecting team.

DATE-MONTH - conventional numeric codes 1-12 for month.

DATE-DAY - day of month.

Line 03 - This line is completed on-site. The fields YEAR and those following
require the active participation of the householder.

SEQ - the sequence within the day in which the DU was inspected
(first, second, third, etc.). Do not include DU's which were
visited but not inspected.

TEST BLOCK - the average of five XRF readings on-site of the
medium painted panel test block included with the instrument.

TYPE - coded according to II--Type of Construction--on the RHS.
If the type is mixed, enter the code for the predominant type.

XS - the material of the exterior according to III--Outside Surface
of Building--on the RHS. If the surface is mixed, enter the
predominant type code.

PC - the code according to IV--Occupancy--on the RHS. For multi-unit
buildings, a mailbox count is a convenient way of
ascertaining the number of units.

YEAR - the code according to V--Year Built--on the RHS. This
should be obtained from the householder if possible; otherwise
it is to be estimated by the inspector.

OWNER - the code according to VI--Owner-Renter--on the RHS.

MORT - the code according to VII--Mortgage--on the RHS.

PUBLIC - the code according to VIII--Public Housing--on the RHS.
Public housing is that which is operated by some agency of
government whether federal, state, or local.

STTR - the subsidized housing code according to IX--Subsidized--on
the RHS. Subsidized housing is that for which at least some
portion of the rent is paid by some agency of federal, state,
or local government.

CHILD - the number of children, age 6 and under, resident in the DU.

Section II - Interior

This section of the DCF includes lines 4 through 17. Each of these
lines has the same format; the pre-printed CODE field identifies the
room type.

No attempt is to be made to move furniture, pictures, etc., or to
stand on anything to obtain readings. If the entity is inaccessible
(the ceiling, for instance) or does not exist, the field is left blank.

Lines 04 - 17.

Walls and Ceilings

COND - the material and condition of the walls, ceiling, etc.
This is a three character field in which one character
is a material code according to XI (a) of the RHS; one
character is a base or a substrate condition code according
to XI (b) of the RHS; and the third character is a surface
condition code according to XI (c) of the RHS. The order
in which these characters appear is immaterial. The code
P2Q, for example, is equivalent to PQ2, 2PQ, or QP2.

These codes are for the room in general rather than for
each specific surface. If the surfaces are mixed, the
code which most nearly characterizes the room at the height
of four feet or less should be entered.

Base Condition codes must come from a subjective evaluation
of the painted or varnished surfaces. The criterion is an
estimation of the magnitude of work involved if redecoration
were considered:

1. If no work is required.

2. If minor work is required (small cracks,
no deep or large holes).

3. If major work is required (large cracks,
deep or large holes, structural repairs).

Since COND reflects the general condition of the room,
it is best to complete this field after the room
inspection has been completed.

WALL 1, WALL 2, WALL 3, WALL 4, and CEILING - each of these is
an XRF reading taken on the indicated surface. The walls
are numbered clockwise beginning with the wall to the left
of the entry. The wall reading should be taken at any
point less than four feet from the floor if such a point
is accessible. The ceiling reading is to be taken only
if it is possible to do so by standing on the floor or
stairs. (Do not stand on furniture or counter tops, etc.)
If the surface is inaccessible enter an "x" in the field;
if the surface does not exist (three walled room for example)
leave the reading field blank. If the surface is not painted
or varnished, enter a U rather than an XRF reading.

TRIM COND - as with the COND field for walls and ceilings, this
field must be for all trim within the room (i.e., windows,
doors, and baseboards).

WINDOW NUMBER - enter the number of windows in the room ("0" if none).

WINDOW READING - enter the reading obtained from the window frame
or sill; if possible this should be from a vertical surface
within four feet of the floor.

DOOR NUMBER - enter the number of doors in the room ("0" if none).

DOOR READING - enter the XRF reading from any point on the door's
surface which is within four feet of the floor. If the door
is of variable thickness, take this reading from the thick part
of the door to avoid the read through problem (detecting lead
on the back surface of the door).

BASEBOARD - enter the XRF reading from an accessible part of the
baseboard from any convenient wall of the room. If the
baseboard is inaccessible enter an "x"; if there is no baseboard,
leave blank.

OTHER

FLOOR - enter an XRF reading only if the floor is a painted one.

RADIATOR, CABINET, FIREPLACE - enter XRF readings for each of
these which exists; leave blank for each non-existent entity.

Section  III  -  Exterior 

The  exterior  section  differs  from  the  interior  in  that  there  is 
a  condition  code  for  each  XRF  reading  taken.  Criteria  for  each  condition 
are  the  same  as  for  the  interior.  Readings  are  to  be  made  only  if  the 
designated  areas  are  painted.  As  with  the  interior  readings,  the  XRF 
readings  should  be  taken  at  any  point  at  a  height  of  four  feet  or  less. 

WALL  RDG  -  the  outside  wall  of  the  dwelling  unit,  reading  taken 

from  the  predominant  exterior  surface. 
PORCH  RDG  -  the  floor  of  the  porch. 

DOOR  RDG  -  the  exterior  door.  If  the  door  is  of  variable 
thickness,  this  reading  is  taken  from  the  thick  portion. 

WINDOW RDG - the window frame or sill.

RAILING  RDG  -  the  porch  or  stair  railing. 
FENCE  RDG  -  the  fence. 
GARAGE  RDG  -  the  garage  wall. 

EXCLUDED  ROOMS  -  enter  the  number  of  rooms  of  indicated  type 
which  were  not  inspected. 


APPENDIX  D 

SPECIFICATIONS  FOR  SAMPLE  GENERATION  PROGRAM 
This  appendix  contains  the  specifications  for  a  computer  program  to 
generate  samples,  the  listing  of  the  program  and  an  example  of  the  output 
generated.  For  the  example,  1023  entries  were  generated  with  320  appearing 
in  the  first  sample.  The  directory  assumed  has  two  columns  with  sixty- 
lines per page; the names begin on page 6 and run through
page 50; the starting seed (base 8) = 000011111111 012345012345.
Sample  Generator 

The  sample  generator  creates  a  set  of  samples  from  a  hard  copy  directory. 
It  is  parameterized  so  that  any  directory  may  be  used.  The  parameters: 

1.  Page  number  of  first  page  containing  addresses  (must  be  >1  and  <1024) 

2. Page number of last page containing addresses (must be ≥1 and <1024)

3.  Number  of  columns  per  page  (must  be  >1  and  <8) 

4.  Number  of  lines  per  page  (must  be  >1  and  <512) 

5.  Size  of  first  sample  (must  be  <8190;  the  sample  produced  will  be  the 
least  multiple  of  160  which  is  greater  than  or  equal  to  the  size 
specified) 
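
The rounding rule in parameter 5 can be illustrated with a short sketch. This is Python written for illustration, not the FORTRAN listing; the function name first_sample_size is hypothetical, and the range limits in the list above are deliberately not checked here.

```python
def first_sample_size(requested: int, block: int = 160) -> int:
    """Return the least multiple of `block` that is >= `requested`."""
    if requested < 1:
        raise ValueError("requested size must be positive")
    # Ceiling division, then scale back up to a whole number of blocks.
    return -(-requested // block) * block
```

Under this rule a requested size of 300 yields a first sample of 320, and a requested size of 320 is left unchanged.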

The program generates 8191 triples, each triple being a set of randomly
generated indices denoting page, column and line of the directory. Duplicate
triples are eliminated, and the first sample set is sorted into directory
order and printed. This is the original sample to be used in the survey. The
remaining triples are then sorted, in forty-entry blocks, into directory
order, and these sets are printed as extras.
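
The generation procedure just described might be sketched as follows. This is an illustration in Python under stated assumptions, not the DIRECT program itself: Python's random module stands in for RAND, indices are drawn as uniform integers, duplicates are dropped in order of generation, and the first sample is taken as the first first_size distinct triples. The manual does not specify any of these details, and the name generate_samples is hypothetical.

```python
import random

def generate_samples(ips, ipn, ic, il, first_size, seed=None,
                     n_triples=8191, block=40):
    """Return (first_sample, extras) of (page, column, line) triples."""
    rng = random.Random(seed)   # stand-in for the manual's RAND function
    seen = set()
    unique = []                 # distinct triples, in order of generation
    for _ in range(n_triples):
        triple = (rng.randint(ips, ipn),   # page
                  rng.randint(1, ic),      # column
                  rng.randint(1, il))      # line
        if triple not in seen:             # eliminate duplicates
            seen.add(triple)
            unique.append(triple)
    # The first sample is sorted into directory order (page, column, line).
    first = sorted(unique[:first_size])
    rest = unique[first_size:]
    # The remaining triples are sorted block by block, forty at a time.
    extras = [sorted(rest[i:i + block]) for i in range(0, len(rest), block)]
    return first, extras
```

Here first_size is assumed to have been rounded up to a multiple of 160 already. Sorting each set by (page, column, line) lets a clerk work through the directory from front to back.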

Each of the sample sets thus generated is a random sample, as is
any combination of these sets. In order to preserve the validity, if
any entry from a set is used, that entire sample set must be used;
otherwise a bias occurs (see Section I on sampling).

Because of inconsistencies in the directory (short pages, commercial
addresses, blank lines, etc.) and other problems which may be discovered
in the field (non-existent or non-residential buildings), extra sets
may have to be added to the survey sample until the required number of
units is obtained.
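
The rule that extra sets are added whole, never entry by entry, can be sketched as follows. This is an illustrative Python fragment: the function name build_survey_sample and the predicate is_usable (standing for the clerical and field checks that reject blank lines, commercial addresses, and the like) are hypothetical, and taking the extra sets in their printed order is an assumption.

```python
def build_survey_sample(first, extras, required, is_usable):
    """Add whole extra sets, in order, until `required` usable units exist."""
    chosen = list(first)
    usable = sum(1 for entry in chosen if is_usable(entry))
    sets_used = 0
    for block in extras:
        if usable >= required:
            break
        # Take the entire set, never individual entries, to avoid bias.
        chosen.extend(block)
        usable += sum(1 for entry in block if is_usable(entry))
        sets_used += 1
    return chosen, usable, sets_used
```

Because a set is always taken in full, the final count of usable units may exceed the required number.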

This program consists of a main program (DIRECT) and a subroutine
(SORT), both written in FORTRAN V, and a function subroutine (RAND) in
assembly language. The FORTRAN programs should be readily transportable
to any computer system of sufficient size with a word length of 36 bits or
more. On another system, RAND must be replaced by a function which
calculates a random variable, x, 0<x<1, with a statement:

X = RANDNO (0, SEED)

where SEED is changed within the function subroutine to advance the
random variable with successive calls. The method used in RAND is the
conventional multiplicative one, which requires input of a starting
value of SEED. A different sequence of random variables can be generated
on subsequent runs because the last value of SEED is printed at run
termination and can be supplied as the next starting value.
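
A replacement for RAND of the multiplicative kind might look like the sketch below. It is written in Python for illustration only. The multiplier 16807 and modulus 2**31 - 1 are assumptions (a widely used pair); the manual does not give the constants used in RAND, and the class name MultiplicativeGenerator is hypothetical.

```python
class MultiplicativeGenerator:
    """Multiplicative generator: seed <- (A * seed) mod M, x = seed / M."""

    A = 16807          # assumed multiplier (not taken from the manual)
    M = 2**31 - 1      # assumed prime modulus (not taken from the manual)

    def __init__(self, seed: int):
        if not 0 < seed < self.M:
            raise ValueError("seed must satisfy 0 < seed < M")
        self.seed = seed

    def next(self) -> float:
        # Advance the seed in place, as RAND does on each call.
        self.seed = (self.A * self.seed) % self.M
        return self.seed / self.M   # strictly between 0 and 1
```

At the end of a run, the current value of the seed attribute can be printed (for example in octal, with format(gen.seed, "012o")) and entered as the starting seed of the next run, so that each run continues the sequence instead of repeating it.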

Required  Input 

1. Seed & sentinel (Format 3O13)

a.  High  order  part  of  SEED 

b.  Low  order  part  of  SEED 

c.  Sentinel  of  all  binary  l's  (377777777777) 

2. Directory parameters (Format 5I5)

a.  IPS  Starting  page 

b.  IPN  Last  page 
c. IC Number of columns/page

d. IL Number of lines/page

e. IFIRST Size of first sample
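
Reading the two input records could be sketched as follows, in Python for illustration. The sketch assumes that the formats are FORTRAN-style field descriptors: three octal fields of 13 columns each for the first record and five integer fields of 5 columns each for the second. That reading of the format codes is an assumption, as are the function name parse_input and the keys SEED_HIGH and SEED_LOW.

```python
def parse_input(seed_card: str, param_card: str) -> dict:
    """Parse the two input records into integers (field widths assumed)."""
    # Three octal fields, 13 columns each: high seed, low seed, sentinel.
    octal = [int(seed_card[i:i + 13], 8) for i in range(0, 39, 13)]
    # Five decimal fields, 5 columns each.
    names = ("IPS", "IPN", "IC", "IL", "IFIRST")
    values = [int(param_card[i:i + 5]) for i in range(0, 25, 5)]
    record = dict(zip(names, values))
    record["SEED_HIGH"], record["SEED_LOW"], sentinel = octal
    if sentinel != 0o377777777777:
        raise ValueError("sentinel field is not 377777777777")
    return record
```

Checking the sentinel guards against a seed record that is shifted or truncated.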

Output:

INPUT PRINT α1 α2 α3
            β1 β2 β3 β4 β5
NEXT VALUE OF SEED γ1 γ2     THERE ARE δ UNIQUE UNITS

followed by the samples generated.

α1 is 12 character octal high order part of initial SEED

α2 is 12 character octal low order part of initial SEED

α3 is 377777777777 (α1 - α3 are print-back of seed & sentinel input)

β1 is IPS

β2 is IPN

β3 is IC

β4 is IL

β5 is IFIRST (β1 - β5 are print-back of directory parameters)

γ1 is 12 character octal high order part of last SEED

γ2 is 12 character octal low order part of last SEED

δ is the number of distinct units in the 8191 unit sample after
duplicates have been eliminated.